
COMPOUNDS AND METHODS FOR DIAGNOSIS OF TUBERCULOSIS 



TECHNICAL FIELD 

The present invention relates generally to the detection of Mycobacterium
tuberculosis infection. The invention is more particularly related to polypeptides comprising
a Mycobacterium tuberculosis antigen, or a portion or other variant thereof, and the use of
such polypeptides for the serodiagnosis of Mycobacterium tuberculosis infection.



BACKGROUND OF THE INVENTION 

Tuberculosis is a chronic, infectious disease that is generally caused by
infection with Mycobacterium tuberculosis. It is a major disease in developing countries, as
well as an increasing problem in developed areas of the world, with about 8 million new
cases and 3 million deaths each year. Although the infection may be asymptomatic for a
considerable period of time, the disease is most commonly manifested as an acute
inflammation of the lungs, resulting in fever and a nonproductive cough. If left untreated,
serious complications and death typically result.

Although tuberculosis can generally be controlled using extended antibiotic
therapy, such treatment is not sufficient to prevent the spread of the disease. Infected
individuals may be asymptomatic, but contagious, for some time. In addition, although
compliance with the treatment regimen is critical, patient behavior is difficult to monitor.
Some patients do not complete the course of treatment, which can lead to ineffective
treatment and the development of drug resistance.

Inhibiting the spread of tuberculosis will require effective vaccination and
accurate, early diagnosis of the disease. Currently, vaccination with live bacteria is the most
efficient method for inducing protective immunity. The most common Mycobacterium for
this purpose is Bacillus Calmette-Guerin (BCG), an avirulent strain of Mycobacterium bovis.
However, the safety and efficacy of BCG are a source of controversy and some countries, such
as the United States, do not vaccinate the general public. Diagnosis is commonly achieved
using a skin test, which involves intradermal exposure to tuberculin PPD (protein-purified
derivative). Antigen-specific T cell responses result in measurable induration at the injection
site by 48-72 hours after injection, which indicates exposure to Mycobacterial antigens.
Sensitivity and specificity have, however, been a problem with this test, and individuals
vaccinated with BCG cannot be distinguished from infected individuals.

While macrophages have been shown to act as the principal effectors of
M. tuberculosis immunity, T cells are the predominant inducers of such immunity. The
essential role of T cells in protection against M. tuberculosis infection is illustrated by the
frequent occurrence of M. tuberculosis in AIDS patients, due to the depletion of CD4 T cells
associated with human immunodeficiency virus (HIV) infection. Mycobacterium-reactive
CD4 T cells have been shown to be potent producers of gamma-interferon (IFN-γ), which, in
turn, has been shown to trigger the anti-mycobacterial effects of macrophages in mice. While
the role of IFN-γ in humans is less clear, studies have shown that 1,25-dihydroxy-vitamin D3,
either alone or in combination with IFN-γ or tumor necrosis factor-alpha, activates human
macrophages to inhibit M. tuberculosis infection. Furthermore, it is known that IFN-γ
stimulates human macrophages to make 1,25-dihydroxy-vitamin D3. Similarly, IL-12 has
been shown to play a role in stimulating resistance to M. tuberculosis infection. For a review
of the immunology of M. tuberculosis infection see Chan and Kaufmann, in Tuberculosis:
Pathogenesis, Protection and Control, Bloom (ed.), ASM Press, Washington, DC, 1994.

Accordingly, there is a need in the art for improved diagnostic methods for
detecting tuberculosis. The present invention fulfills this need and further provides other
related advantages.

SUMMARY OF THE INVENTION 

Briefly stated, the present invention provides compositions and methods for
diagnosing tuberculosis. In one aspect, polypeptides are provided comprising an antigenic
portion of a soluble M. tuberculosis antigen, or a variant of such an antigen that differs only
in conservative substitutions and/or modifications. In one embodiment of this aspect, the
soluble antigen has one of the following N-terminal sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
(SEQ ID NO: 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
Lys-Glu-Gly-Arg (SEQ ID NO: 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro
(SEQ ID NO: 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
NO: 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
NO: 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser (SEQ ID NO: 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
(SEQ ID NO: 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ
ID NO: 123);

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser
(SEQ ID NO: 129);

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp
(SEQ ID NO: 130); or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly
(SEQ ID NO: 131)

wherein Xaa may be any amino acid. 
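
For illustration only, the notation used in the list above can be handled programmatically as sketched below in Python. The sketch converts three-letter residue codes to one-letter codes and treats Xaa as a wildcard that matches any single residue at the N-terminus. The function names and the candidate sequence are hypothetical and are not part of the disclosure; only sequence (e) is taken from the list.

```python
import re

# Standard three-letter to one-letter amino acid codes; "Xaa" marks an unknown residue.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert 'Asp-Ile-Gly' style notation to one-letter codes ('DIG')."""
    residues = [r for r in three_letter_seq.replace(" ", "").split("-") if r]
    return "".join(THREE_TO_ONE[r] for r in residues)


def n_terminal_pattern(one_letter_seq: str) -> re.Pattern:
    """Build a regex anchored at the N-terminus; X matches any single residue."""
    body = "".join("[A-Z]" if aa == "X" else aa for aa in one_letter_seq)
    return re.compile("^" + body)


# Sequence (e) from the list above (SEQ ID NO: 119).
seq_e = to_one_letter("Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val")
pattern_e = n_terminal_pattern(seq_e)

# Hypothetical candidate protein, invented only to exercise the matcher.
candidate = "DIGSESTEDQQKAVLLLL"
print(seq_e)
print(bool(pattern_e.match(candidate)))
```

Anchoring the pattern at the start of the string reflects the fact that the recited sequences are N-terminal; a search anywhere in the protein would be a different, looser criterion.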

In a related aspect, polypeptides are provided comprising an immunogenic
portion of an M. tuberculosis antigen, or a variant of such an antigen that differs only in
conservative substitutions and/or modifications, the antigen having one of the following
N-terminal sequences:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val (SEQ ID NO: 132); or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe (SEQ ID NO: 124)

wherein Xaa may be any amino acid.

In another embodiment, the soluble M. tuberculosis antigen comprises an
amino acid sequence encoded by a DNA sequence selected from the group consisting of the
sequences recited in SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, the complements of said
sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID NOS: 1, 2,
4-10, 13-25, 52, 94 and 96 or a complement thereof under moderately stringent conditions.

In a related aspect, the polypeptides comprise an antigenic portion of an
M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative
substitutions and/or modifications, wherein the antigen comprises an amino acid sequence
encoded by a DNA sequence selected from the group consisting of the sequences recited in
SEQ ID NOS: 26-51, 133, 134, 158-178 and 196, the complements of said sequences, and
DNA sequences that hybridize to a sequence recited in SEQ ID NOS: 26-51, 133, 134,
158-178 and 196 or a complement thereof under moderately stringent conditions.

In related aspects, DNA sequences encoding the above polypeptides,
recombinant expression vectors comprising these DNA sequences and host cells transformed
or transfected with such expression vectors are also provided.

In another aspect, the present invention provides fusion proteins comprising a
first and a second inventive polypeptide or, alternatively, an inventive polypeptide and a
known M. tuberculosis antigen.

In further aspects of the subject invention, methods and diagnostic kits are
provided for detecting tuberculosis in a patient. The methods comprise: (a) contacting a
biological sample with at least one of the above polypeptides; and (b) detecting in the sample
the presence of antibodies that bind to the polypeptide or polypeptides, thereby detecting
M. tuberculosis infection in the biological sample. Suitable biological samples include whole
blood, sputum, serum, plasma, saliva, cerebrospinal fluid and urine. The diagnostic kits
comprise one or more of the above polypeptides in combination with a detection reagent.

The present invention also provides methods for detecting M. tuberculosis
infection comprising: (a) obtaining a biological sample from a patient; (b) contacting the
sample with at least one oligonucleotide primer in a polymerase chain reaction, the
oligonucleotide primer being specific for a DNA sequence encoding the above polypeptides;
and (c) detecting in the sample a DNA sequence that amplifies in the presence of the first and
second oligonucleotide primers. In one embodiment, the oligonucleotide primer comprises at
least about 10 contiguous nucleotides of such a DNA sequence.

In a further aspect, the present invention provides a method for detecting
M. tuberculosis infection in a patient comprising: (a) obtaining a biological sample from the
patient; (b) contacting the sample with an oligonucleotide probe specific for a DNA sequence
encoding the above polypeptides; and (c) detecting in the sample a DNA sequence that
hybridizes to the oligonucleotide probe. In one embodiment, the oligonucleotide probe
comprises at least about 15 contiguous nucleotides of such a DNA sequence.
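
As a non-limiting illustration of the contiguity criteria recited above (at least about 10 contiguous nucleotides for a primer, at least about 15 for a probe), the following Python sketch measures the longest stretch an oligonucleotide shares with a target sequence. The function names and both sequences are invented placeholders, not entries from the Sequence Listing, and exact string matching on a single strand is a simplifying assumption that ignores the complementary strand and hybridization conditions.

```python
def longest_shared_run(oligo: str, target: str) -> int:
    """Return the length of the longest contiguous stretch of `oligo` present in `target`."""
    oligo, target = oligo.upper(), target.upper()
    best = 0
    for start in range(len(oligo)):
        end = start + best + 1  # only look for runs longer than the current best
        while end <= len(oligo) and oligo[start:end] in target:
            best = end - start
            end += 1
    return best


def meets_contiguity(oligo: str, target: str, min_len: int) -> bool:
    """True if `oligo` shares at least `min_len` contiguous nucleotides with `target`."""
    return longest_shared_run(oligo, target) >= min_len


# Placeholder sequences invented for this sketch (not taken from any SEQ ID NO).
target_dna = "ATGGCCGATCGTACGTTAGCCGATAGC"
oligo = "GATCGTACGTTAGC"  # 14 nucleotides, all contiguous in target_dna

print(meets_contiguity(oligo, target_dna, min_len=10))  # primer-style threshold
print(meets_contiguity(oligo, target_dna, min_len=15))  # probe-style threshold
```

The threshold is passed as a parameter so that the same check serves either embodiment.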

In yet another aspect, the present invention provides antibodies, both
polyclonal and monoclonal, that bind to the polypeptides described above, as well as methods
for their use in the detection of M. tuberculosis infection.

These and other aspects of the present invention will become apparent upon
reference to the following detailed description and attached drawings. All references
disclosed herein are hereby incorporated by reference in their entirety as if each was
incorporated individually.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE IDENTIFIERS 

Figures 1A and B illustrate the stimulation of proliferation and interferon-γ
production in T cells derived from a first and a second M. tuberculosis-immune donor,
respectively, by the 14 Kd, 20 Kd and 26 Kd antigens described in Example 1.

Figures 2A-D illustrate the reactivity of antisera raised against secretory M.
tuberculosis proteins, the known M. tuberculosis antigen 85b and the inventive antigens
Tb38-1 and TbH-9, respectively, with M. tuberculosis lysate (lane 2), M. tuberculosis
secretory proteins (lane 3), recombinant Tb38-1 (lane 4), recombinant TbH-9 (lane 5) and
recombinant 85b (lane 5).

Figure 3A illustrates the stimulation of proliferation in a TbH-9-specific T cell
clone by secretory M. tuberculosis proteins, recombinant TbH-9 and a control antigen,
TbRa11.

Figure 3B illustrates the stimulation of interferon-γ production in a TbH-9-
specific T cell clone by secretory M. tuberculosis proteins, PPD and recombinant TbH-9.

Figure 4 illustrates the reactivity of two representative polypeptides with sera
from M. tuberculosis-infected and uninfected individuals, as compared to the reactivity of
bacterial lysate.

Figure 5 shows the reactivity of four representative polypeptides with sera
from M. tuberculosis-infected and uninfected individuals, as compared to the reactivity of the
38 kD antigen.

Figure 6 shows the reactivity of recombinant 38 kD and TbRa11 antigens with
sera from M. tuberculosis patients, PPD positive donors and normal donors.

Figure 7 shows the reactivity of the antigen TbRa2A with 38 kD negative sera.

Figure 8 shows the reactivity of the antigen of SEQ ID NO: 60 with sera from
M. tuberculosis patients and normal donors.

Figure 9 illustrates the reactivity of the recombinant antigen TbH-29 (SEQ ID
NO: 137) with sera from M. tuberculosis patients, PPD positive donors and normal donors as
determined by indirect ELISA.

Figure 10 illustrates the reactivity of the recombinant antigen TbH-33 (SEQ
ID NO: 140) with sera from M. tuberculosis patients and from normal donors, and with a pool
of sera from M. tuberculosis patients, as determined both by direct and indirect ELISA.

Figure 11 illustrates the reactivity of increasing concentrations of the
recombinant antigen TbH-33 (SEQ ID NO: 140) with sera from M. tuberculosis patients and
from normal donors as determined by ELISA.

SEQ. ID NO. 1 is the DNA sequence of TbRa1.
SEQ. ID NO. 2 is the DNA sequence of TbRa10.
SEQ. ID NO. 3 is the DNA sequence of TbRa11.
SEQ. ID NO. 4 is the DNA sequence of TbRa12.
SEQ. ID NO. 5 is the DNA sequence of TbRa13.
SEQ. ID NO. 6 is the DNA sequence of TbRa16.
SEQ. ID NO. 7 is the DNA sequence of TbRa17.
SEQ. ID NO. 8 is the DNA sequence of TbRa18.
SEQ. ID NO. 9 is the DNA sequence of TbRa19.
SEQ. ID NO. 10 is the DNA sequence of TbRa24.
SEQ. ID NO. 11 is the DNA sequence of TbRa26.
SEQ. ID NO. 12 is the DNA sequence of TbRa28.
SEQ. ID NO. 13 is the DNA sequence of TbRa29.
SEQ. ID NO. 14 is the DNA sequence of TbRa2A.
SEQ. ID NO. 15 is the DNA sequence of TbRa3.
SEQ. ID NO. 16 is the DNA sequence of TbRa32.
SEQ. ID NO. 17 is the DNA sequence of TbRa35.
SEQ. ID NO. 18 is the DNA sequence of TbRa36.
SEQ. ID NO. 19 is the DNA sequence of TbRa4.
SEQ. ID NO. 20 is the DNA sequence of TbRa9.
SEQ. ID NO. 21 is the DNA sequence of TbRaB.
SEQ. ID NO. 22 is the DNA sequence of TbRaC.
SEQ. ID NO. 23 is the DNA sequence of TbRaD.
SEQ. ID NO. 24 is the DNA sequence of YYWCPG.
SEQ. ID NO. 25 is the DNA sequence of AAMK.
SEQ. ID NO. 26 is the DNA sequence of TbL-23.
SEQ. ID NO. 27 is the DNA sequence of TbL-24.
SEQ. ID NO. 28 is the DNA sequence of TbL-25.
SEQ. ID NO. 29 is the DNA sequence of TbL-28.
SEQ. ID NO. 30 is the DNA sequence of TbL-29.
SEQ. ID NO. 31 is the DNA sequence of TbH-5.
SEQ. ID NO. 32 is the DNA sequence of TbH-8.
SEQ. ID NO. 33 is the DNA sequence of TbH-9.
SEQ. ID NO. 34 is the DNA sequence of TbM-1.
SEQ. ID NO. 35 is the DNA sequence of TbM-3.
SEQ. ID NO. 36 is the DNA sequence of TbM-6.
SEQ. ID NO. 37 is the DNA sequence of TbM-7.
SEQ. ID NO. 38 is the DNA sequence of TbM-9.
SEQ. ID NO. 39 is the DNA sequence of TbM-12.
SEQ. ID NO. 40 is the DNA sequence of TbM-13.
SEQ. ID NO. 41 is the DNA sequence of TbM-14.
SEQ. ID NO. 42 is the DNA sequence of TbM-15.
SEQ. ID NO. 43 is the DNA sequence of TbH-4.
SEQ. ID NO. 44 is the DNA sequence of TbH-4-FWD.
SEQ. ID NO. 45 is the DNA sequence of TbH-12.
SEQ. ID NO. 46 is the DNA sequence of Tb38-1.
SEQ. ID NO. 47 is the DNA sequence of Tb38-4.
SEQ. ID NO. 48 is the DNA sequence of TbL-17.
SEQ. ID NO. 49 is the DNA sequence of TbL-20.
SEQ. ID NO. 50 is the DNA sequence of TbL-21.
SEQ. ID NO. 51 is the DNA sequence of TbH-16.
SEQ. ID NO. 52 is the DNA sequence of DPEP.
SEQ. ID NO. 53 is the deduced amino acid sequence of DPEP.
SEQ. ID NO. 54 is the protein sequence of DPV N-terminal Antigen.
SEQ. ID NO. 55 is the protein sequence of AVGS N-terminal Antigen.
SEQ. ID NO. 56 is the protein sequence of AAMK N-terminal Antigen.
SEQ. ID NO. 57 is the protein sequence of YYWC N-terminal Antigen.
SEQ. ID NO. 58 is the protein sequence of DIGS N-terminal Antigen.
SEQ. ID NO. 59 is the protein sequence of AEES N-terminal Antigen.
SEQ. ID NO. 60 is the protein sequence of DPEP N-terminal Antigen.
SEQ. ID NO. 61 is the protein sequence of APKT N-terminal Antigen.
SEQ. ID NO. 62 is the protein sequence of DPAS N-terminal Antigen.
SEQ. ID NO. 63 is the deduced amino acid sequence of TbM-1 Peptide.
SEQ. ID NO. 64 is the deduced amino acid sequence of TbRa1.
SEQ. ID NO. 65 is the deduced amino acid sequence of TbRa10.
SEQ. ID NO. 66 is the deduced amino acid sequence of TbRa11.
SEQ. ID NO. 67 is the deduced amino acid sequence of TbRa12.
SEQ. ID NO. 68 is the deduced amino acid sequence of TbRa13.
SEQ. ID NO. 69 is the deduced amino acid sequence of TbRa16.
SEQ. ID NO. 70 is the deduced amino acid sequence of TbRa17.
SEQ. ID NO. 71 is the deduced amino acid sequence of TbRa18.
SEQ. ID NO. 72 is the deduced amino acid sequence of TbRa19.
SEQ. ID NO. 73 is the deduced amino acid sequence of TbRa24.
SEQ. ID NO. 74 is the deduced amino acid sequence of TbRa26.
SEQ. ID NO. 75 is the deduced amino acid sequence of TbRa28.
SEQ. ID NO. 76 is the deduced amino acid sequence of TbRa29.
SEQ. ID NO. 77 is the deduced amino acid sequence of TbRa2A.
SEQ. ID NO. 78 is the deduced amino acid sequence of TbRa3.
SEQ. ID NO. 79 is the deduced amino acid sequence of TbRa32.
SEQ. ID NO. 80 is the deduced amino acid sequence of TbRa35.
SEQ. ID NO. 81 is the deduced amino acid sequence of TbRa36.
SEQ. ID NO. 82 is the deduced amino acid sequence of TbRa4.
SEQ. ID NO. 83 is the deduced amino acid sequence of TbRa9.
SEQ. ID NO. 84 is the deduced amino acid sequence of TbRaB.
SEQ. ID NO. 85 is the deduced amino acid sequence of TbRaC.
SEQ. ID NO. 86 is the deduced amino acid sequence of TbRaD.
SEQ. ID NO. 87 is the deduced amino acid sequence of YYWCPG.
SEQ. ID NO. 88 is the deduced amino acid sequence of TbAAMK.
SEQ. ID NO. 89 is the deduced amino acid sequence of Tb38-1.
SEQ. ID NO. 90 is the deduced amino acid sequence of TbH-4.
SEQ. ID NO. 91 is the deduced amino acid sequence of TbH-8.
SEQ. ID NO. 92 is the deduced amino acid sequence of TbH-9.
SEQ. ID NO. 93 is the deduced amino acid sequence of TbH-12.
SEQ. ID NO. 94 is the DNA sequence of DPAS.
SEQ. ID NO. 95 is the deduced amino acid sequence of DPAS.
SEQ. ID NO. 96 is the DNA sequence of DPV.
SEQ. ID NO. 97 is the deduced amino acid sequence of DPV.
SEQ. ID NO. 98 is the DNA sequence of ESAT-6.
SEQ. ID NO. 99 is the deduced amino acid sequence of ESAT-6.
SEQ. ID NO. 100 is the DNA sequence of TbH-8-2.
SEQ. ID NO. 101 is the DNA sequence of TbH-9FL.
SEQ. ID NO. 102 is the deduced amino acid sequence of TbH-9FL.
SEQ. ID NO. 103 is the DNA sequence of TbH-9-1.
SEQ. ID NO. 104 is the deduced amino acid sequence of TbH-9-1.
SEQ. ID NO. 105 is the DNA sequence of TbH-9-4.
SEQ. ID NO. 106 is the deduced amino acid sequence of TbH-9-4.
SEQ. ID NO. 107 is the DNA sequence of Tb38-1F2 IN.
SEQ. ID NO. 108 is the DNA sequence of Tb38-1F2 RP.
SEQ. ID NO. 109 is the deduced amino acid sequence of Tb37-FL.
SEQ. ID NO. 110 is the deduced amino acid sequence of Tb38-IN.
SEQ. ID NO. 111 is the DNA sequence of Tb38-1F3.
SEQ. ID NO. 112 is the deduced amino acid sequence of Tb38-1F3.
SEQ. ID NO. 113 is the DNA sequence of Tb38-1F5.
SEQ. ID NO. 114 is the DNA sequence of Tb38-1F6.
SEQ. ID NO. 115 is the deduced N-terminal amino acid sequence of DPV.
SEQ. ID NO. 116 is the deduced N-terminal amino acid sequence of AVGS.
SEQ. ID NO. 117 is the deduced N-terminal amino acid sequence of AAMK.
SEQ. ID NO. 118 is the deduced N-terminal amino acid sequence of YYWC.
SEQ. ID NO. 119 is the deduced N-terminal amino acid sequence of DIGS.
SEQ. ID NO. 120 is the deduced N-terminal amino acid sequence of AAES.
SEQ. ID NO. 121 is the deduced N-terminal amino acid sequence of DPEP.
SEQ. ID NO. 122 is the deduced N-terminal amino acid sequence of APKT.
SEQ. ID NO. 123 is the deduced N-terminal amino acid sequence of DPAS.
SEQ. ID NO. 124 is the protein sequence of DPPD N-terminal Antigen.
SEQ ID NO. 125-128 are the protein sequences of four DPPD cyanogen bromide
fragments.



SEQ ID NO. 129 is the ...
SEQ ID NO. 130 is the ...
SEQ ID NO. 131 is the ...
SEQ ID NO. 132 is the ...
SEQ ID NO. 133 is the ...
SEQ ID NO. 134 is the ...
SEQ ID NO. 135 is the ...
SEQ ID NO. 136 is the ...
SEQ ID NO. 137 is the ...
SEQ ID NO. 138 is the ...
SEQ ID NO. 139 is the ...
SEQ ID NO. 140 is the ...



SEQ ID NO: 141-146 are PCR primers used in the preparation of a fusion protein
containing TbRa3, 38 kD and Tb38-1.

SEQ ID NO: 147 is the DNA sequence of the fusion protein containing TbRa3, 38 kD 
and Tb38-1. 

SEQ ID NO: 148 is the amino acid sequence of the fusion protein containing TbRa3, 
38 kD and Tb38-1. 

SEQ ID NO: 149 is the DNA sequence of the M. tuberculosis antigen 38 kD. 

SEQ ID NO: 150 is the amino acid sequence of the M. tuberculosis antigen 38 kD. 

SEQ ID NO: 151 is the DNA sequence of XP14.

SEQ ID NO: 152 is the DNA sequence of XP24. 

SEQ ID NO: 153 is the DNA sequence of XP31. 

SEQ ID NO: 154 is the 5' DNA sequence of XP32. 

SEQ ID NO: 155 is the 3' DNA sequence of XP32. 

SEQ ID NO: 156 is the predicted amino acid sequence of XP14. 

SEQ ID NO: 157 is the predicted amino acid sequence encoded by the reverse 
complement of XP14. 
SEQ ID NO: 158 is the DNA sequence of XP27.
SEQ ID NO: 159 is the DNA sequence of XP36.
SEQ ID NO: 160 is the 5' DNA sequence of XP4.
SEQ ID NO: 161 is the 5' DNA sequence of XP5.
SEQ ID NO: 162 is the 5' DNA sequence of XP17.
SEQ ID NO: 163 is the 5' DNA sequence of XP30.
SEQ ID NO: 164 is the 5' DNA sequence of XP2.
SEQ ID NO: 165 is the 3' DNA sequence of XP2.
SEQ ID NO: 166 is the 5' DNA sequence of XP3.
SEQ ID NO: 167 is the 3' DNA sequence of XP3.
SEQ ID NO: 168 is the 5' DNA sequence of XP6.
SEQ ID NO: 169 is the 3' DNA sequence of XP6.
SEQ ID NO: 170 is the 5' DNA sequence of XP18.
SEQ ID NO: 171 is the 3' DNA sequence of XP18.
SEQ ID NO: 172 is the 5' DNA sequence of XP19.
SEQ ID NO: 173 is the 3' DNA sequence of XP19.
SEQ ID NO: 174 is the 5' DNA sequence of XP22.
SEQ ID NO: 175 is the 3' DNA sequence of XP22.
SEQ ID NO: 176 is the 5' DNA sequence of XP25.
SEQ ID NO: 177 is the 3' DNA sequence of XP25.
SEQ ID NO: 178 is the full-length DNA sequence of TbH4-XP1.
SEQ ID NO: 179 is the predicted amino acid sequence of TbH4-XP1.
SEQ ID NO: 180 is the predicted amino acid sequence encoded by the reverse
complement of TbH4-XP1.
SEQ ID NO: 181 is a first predicted amino acid sequence encoded by XP36.
SEQ ID NO: 182 is a second predicted amino acid sequence encoded by XP36.
SEQ ID NO: 183 is the predicted amino acid sequence encoded by the reverse
complement of XP36.
SEQ ID NO: 184 is the DNA sequence of RDIF2.
SEQ ID NO: 185 is the DNA sequence of RDIF5.
SEQ ID NO: 186 is the ...
SEQ ID NO: 187 is the ...
SEQ ID NO: 188 is the ...
SEQ ID NO: 189 is the ...
SEQ ID NO: 190 is the ...
SEQ ID NO: 191 is the ...
SEQ ID NO: 192 is the ...
SEQ ID NO: 193 is the ...
SEQ ID NO: 194 is the ...
SEQ ID NO: 195 is the ...
SEQ ID NO: 196 is the ...
SEQ ID NO: 197 is the ...
SEQ ID NO: 198 is the ...
SEQ ID NO: 199 is the ...
SEQ ID NO: 200-207 ... protein
containing TbRa3, 38 kD, Tb38-1 and DPEP (hereinafter referred to as TbF-2).
SEQ ID NO: 208 is the DNA sequence of the fusion protein TbF-2. 
SEQ ID NO: 209 is the amino acid sequence of the fusion protein TbF-2. 




DETAILED DESCRIPTION OF THE INVENTION 

As noted above, the present invention is generally directed to compositions
and methods for diagnosing tuberculosis. The compositions of the subject invention include
polypeptides that comprise at least one antigenic portion of an M. tuberculosis antigen, or a
variant of such an antigen that differs only in conservative substitutions and/or modifications.
Polypeptides within the scope of the present invention include, but are not limited to, soluble
M. tuberculosis antigens. A "soluble M. tuberculosis antigen" is a protein of M. tuberculosis
origin that is present in M. tuberculosis culture filtrate. As used herein, the term
"polypeptide" encompasses amino acid chains of any length, including full length proteins
(i.e., antigens), wherein the amino acid residues are linked by covalent peptide bonds. Thus,
a polypeptide comprising an antigenic portion of one of the above antigens may consist
entirely of the antigenic portion, or may contain additional sequences. The additional
sequences may be derived from the native M. tuberculosis antigen or may be heterologous,
and such sequences may (but need not) be antigenic.

An "antigenic portion" of an antigen (which may or may not be soluble) is a
portion that is capable of reacting with sera obtained from an M. tuberculosis-infected
individual (i.e., generates an absorbance reading with sera from infected individuals that is at
least three standard deviations above the absorbance obtained with sera from uninfected
individuals, in a representative ELISA assay described herein). An "M. tuberculosis-infected
individual" is a human who has been infected with M. tuberculosis (e.g., has an intradermal
skin test response to PPD that is at least 0.5 cm in diameter). Infected individuals may
display symptoms of tuberculosis or may be free of disease symptoms. Polypeptides
comprising at least an antigenic portion of one or more M. tuberculosis antigens as described
herein may generally be used, alone or in combination, to detect tuberculosis in a patient.

The compositions and methods of this invention also encompass variants of
the above polypeptides. A "variant," as used herein, is a polypeptide that differs from the
native antigen only in conservative substitutions and/or modifications, such that the antigenic
properties of the polypeptide are retained. Such variants may generally be identified by
modifying one of the above polypeptide sequences, and evaluating the antigenic properties of
the modified polypeptide using, for example, the representative procedures described herein.

A "conservative substitution" is one in which an amino acid is substituted for 
another amino acid that has similar properties, such that one skilled in the art of peptide 
chemistry would expect the secondary structure and hydropathic nature of the polypeptide to 
be substantially unchanged. In general, the following groups of amino acids represent
conservative changes: (1) ala, pro, gly, glu, asp, gln, asn, ser, thr; (2) cys, ser, tyr, thr; (3) val,
ile, leu, met, ala, phe; (4) lys, arg, his; and (5) phe, tyr, trp, his.
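
The five groups recited above lend themselves to a simple lookup, sketched below in Python purely by way of example. The function names are hypothetical, and the sketch covers substitutions only: the deletions, additions and other modifications that a variant may also carry are outside its scope, and it makes no prediction about whether antigenic properties are in fact retained.

```python
# The five groups of conservative changes, as listed in the text above.
CONSERVATIVE_GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative(a: str, b: str) -> bool:
    """True if two different residues share at least one of the groups."""
    a, b = a.lower(), b.lower()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def differs_only_conservatively(native: list[str], variant: list[str]) -> bool:
    """True if equal-length sequences differ only by conservative substitutions."""
    if len(native) != len(variant):
        return False  # length changes are "modifications", not handled here
    return all(
        x.lower() == y.lower() or is_conservative(x, y)
        for x, y in zip(native, variant)
    )


print(is_conservative("lys", "arg"))
print(is_conservative("lys", "glu"))
print(differs_only_conservatively(["Asp", "Pro", "Val"], ["Glu", "Pro", "Leu"]))
```

Because several residues (for example ser, thr, ala, phe, tyr and his) appear in more than one group, membership is tested group by group rather than by assigning each residue a single class.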

Variants may also (or alternatively) be modified by, for example, the deletion
or addition of amino acids that have minimal influence on the antigenic properties, secondary
structure and hydropathic nature of the polypeptide. For example, a polypeptide may be
conjugated to a signal (or leader) sequence at the N-terminal end of the protein which
co-translationally or post-translationally directs transfer of the protein. The polypeptide may
also be conjugated to a linker or other sequence for ease of synthesis, purification or
identification of the polypeptide (e.g., poly-His), or to enhance binding of the polypeptide to a
solid support. For example, a polypeptide may be conjugated to an immunoglobulin Fc
region.

In a related aspect, combination polypeptides are disclosed. A "combination
polypeptide" is a polypeptide comprising at least one of the above antigenic portions and one
or more additional antigenic M. tuberculosis sequences, which are joined via a peptide
linkage into a single amino acid chain. The sequences may be joined directly (i.e., with no
intervening amino acids) or may be joined by way of a linker sequence (e.g., Gly-Cys-Gly)
that does not significantly diminish the antigenic properties of the component polypeptides.

In general, M. tuberculosis antigens, and DNA sequences encoding such
antigens, may be prepared using any of a variety of procedures. For example, soluble
antigens may be isolated from M. tuberculosis culture filtrate by procedures known to those
of ordinary skill in the art, including anion-exchange and reverse phase chromatography.
Purified antigens may then be evaluated for a desired property, such as the ability to react
with sera obtained from an M. tuberculosis-infected individual. Such screens may be
performed using the representative methods described herein. Antigens may then be partially
sequenced using, for example, traditional Edman chemistry. See Edman and Berg, Eur. J.
Biochem. 80:116-132, 1967.

Antigens may also be produced recombinantly using a DNA sequence that
encodes the antigen, which has been inserted into an expression vector and expressed in an
appropriate host. DNA molecules encoding soluble antigens may be isolated by screening an
appropriate M. tuberculosis expression library with anti-sera (e.g., rabbit) raised specifically
against soluble M. tuberculosis antigens. DNA sequences encoding antigens that may or may
not be soluble may be identified by screening an appropriate M. tuberculosis genomic or
cDNA expression library with sera obtained from patients infected with M. tuberculosis.
Such screens may generally be performed using techniques well known in the art, such as
those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratories, Cold Spring Harbor, NY, 1989.

DNA sequences encoding soluble antigens may also be obtained by screening
an appropriate M. tuberculosis cDNA or genomic DNA library for DNA sequences that
hybridize to degenerate oligonucleotides derived from partial amino acid sequences of
isolated soluble antigens. Degenerate oligonucleotide sequences for use in such a screen may
be designed and synthesized, and the screen may be performed, as described (for example) in
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratories, Cold Spring Harbor, NY (and references cited therein). Polymerase chain
reaction (PCR) may also be employed, using the above oligonucleotides in methods well
known in the art, to isolate a nucleic acid probe from a cDNA or genomic library. The library
screen may then be performed using the isolated probe.
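
One way a degenerate oligonucleotide might be derived from a partial amino acid sequence is sketched below in Python, as an illustration only and not as the procedure of the cited manual. The sketch assumes the standard genetic code and collapses all codons for a residue into one IUPAC codon by taking the union of bases at each position; for residues with six codons (Leu, Arg, Ser) this over-covers, so a practical design would normally split such positions into separate pools. The function names are hypothetical; the example peptide is the first six residues of sequence (d) above.

```python
from itertools import product
from math import prod

# Standard genetic code in the conventional TCAG ordering (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_BY_AA: dict[str, list[str]] = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS_BY_AA.setdefault(aa, []).append("".join(codon))

# IUPAC nucleotide ambiguity codes, keyed by the sorted set of bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}
BASES_PER_CODE = {code: len(bases) for bases, code in IUPAC.items()}


def degenerate_codon(aa: str) -> str:
    """Collapse all codons for one residue into a single IUPAC codon (position-wise union)."""
    codons = CODONS_BY_AA[aa]
    return "".join(IUPAC["".join(sorted({c[i] for c in codons}))] for i in range(3))


def back_translate(peptide: str) -> str:
    """Degenerate oligonucleotide covering every coding sequence for `peptide`."""
    return "".join(degenerate_codon(aa) for aa in peptide.upper())


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    return prod(BASES_PER_CODE[code] for code in oligo)


oligo = back_translate("YYWCPG")  # Tyr-Tyr-Trp-Cys-Pro-Gly
print(oligo, degeneracy(oligo))
```

Stretches rich in Trp, Met, Tyr and Cys keep the degeneracy low, which is one reason such regions are commonly favored when choosing where in a partial sequence to place a probe.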

Regardless of the method of preparation, the antigens described herein are
"antigenic." More specifically, the antigens have the ability to react with sera obtained from
an M. tuberculosis-infected individual. Reactivity may be evaluated using, for example, the
representative ELISA assays described herein, where an absorbance reading with sera from
infected individuals that is at least three standard deviations above the absorbance obtained
with sera from uninfected individuals is considered positive.
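
The reactivity criterion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how the uninfected absorbances are summarized, so the sketch assumes the threshold is the mean of the uninfected control readings plus three sample standard deviations. The function names and the absorbance values are hypothetical and are not data from this document.

```python
from statistics import mean, stdev


def positivity_threshold(negative_ods, n_sd=3.0):
    """Mean of uninfected-control absorbances plus n_sd sample standard deviations."""
    return mean(negative_ods) + n_sd * stdev(negative_ods)


def is_reactive(sample_od, negative_ods, n_sd=3.0):
    """True when the sample absorbance is at least the positivity threshold."""
    return sample_od >= positivity_threshold(negative_ods, n_sd)


# Hypothetical absorbance readings from uninfected donors (illustrative only).
controls = [0.10, 0.12, 0.08, 0.11, 0.09]
threshold = positivity_threshold(controls)
print(round(threshold, 3), is_reactive(0.50, controls), is_reactive(0.12, controls))
```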

Antigenic portions of M. tuberculosis antigens may be prepared and identified
using well known techniques, such as those summarized in Paul, Fundamental Immunology,
3d ed., Raven Press, 1993, pp. 243-247 and references cited therein. Such techniques include
screening polypeptide portions of the native antigen for antigenic properties. The
representative ELISAs described herein may generally be employed in these screens. An
antigenic portion of a polypeptide is a portion that, within such representative assays,
generates a signal that is substantially similar to that generated by the full
length antigen. In other words, an antigenic portion of an M. tuberculosis antigen generates at
least about 20%, and preferably about 100%, of the signal induced by the full length antigen
in a model ELISA as described herein.

Portions and other variants of M. tuberculosis antigens may be generated by
synthetic or recombinant means. Synthetic polypeptides having fewer than about 100 amino
acids, and generally fewer than about 50 amino acids, may be generated using techniques
well known in the art. For example, such polypeptides may be synthesized using any of the
commercially available solid-phase techniques, such as the Merrifield solid-phase synthesis
method, where amino acids are sequentially added to a growing amino acid chain. See
Merrifield, J. Am. Chem. Soc. 85:2149-2154, 1963. Equipment for automated synthesis of
polypeptides is commercially available from suppliers such as Applied BioSystems, Inc.,
Foster City, CA, and may be operated according to the manufacturer's instructions. Variants
of a native antigen may generally be prepared using standard mutagenesis techniques, such as
oligonucleotide-directed site-specific mutagenesis. Sections of the DNA sequence may also
be removed using standard techniques to permit preparation of truncated polypeptides.

Recombinant polypeptides containing portions and/or variants of a native
antigen may be readily prepared from a DNA sequence encoding the polypeptide using a
variety of techniques well known to those of ordinary skill in the art. For example,
supernatants from suitable host/vector systems which secrete recombinant protein into culture
media may be first concentrated using a commercially available filter. Following
concentration, the concentrate may be applied to a suitable purification matrix such as an
affinity matrix or an ion exchange resin. Finally, one or more reverse phase HPLC steps can
be employed to further purify a recombinant protein.

Any of a variety of expression vectors known to those of ordinary skill in the
art may be employed to express recombinant polypeptides as described herein. Expression
may be achieved in any appropriate host cell that has been transformed or transfected with an
expression vector containing a DNA molecule that encodes a recombinant polypeptide.
Suitable host cells include prokaryotes, yeast and higher eukaryotic cells. Preferably, the host
cells employed are E. coli, yeast or a mammalian cell line, such as COS or CHO. The DNA
sequences expressed in this manner may encode naturally occurring antigens, portions of
naturally occurring antigens, or other variants thereof.

In general, regardless of the method of preparation, the polypeptides disclosed
herein are prepared in substantially pure form. Preferably, the polypeptides are at least about
80% pure, more preferably at least about 90% pure and most preferably at least about 99%
pure. For use in the methods described herein, however, such substantially pure polypeptides
may be combined.

In certain specific embodiments, the subject invention discloses polypeptides
comprising at least an antigenic portion of a soluble M. tuberculosis antigen (or a variant of
such an antigen), where the antigen has one of the following N-terminal sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
(SEQ ID NO: 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
Lys-Glu-Gly-Arg (SEQ ID NO: 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro
(SEQ ID NO: 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
NO: 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
NO: 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser (SEQ ID NO: 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
(SEQ ID NO: 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-Thr-Ser-
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ
ID NO: 123);

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser
(SEQ ID NO: 129);

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp
(SEQ ID NO: 130); or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly
(SEQ ID NO: 131)

wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence
encoding the antigen identified as (g) above is provided in SEQ ID NO: 52, the deduced
amino acid sequence of which is provided in SEQ ID NO: 53. A DNA sequence encoding
the antigen identified as (a) above is provided in SEQ ID NO: 96; its deduced amino acid
sequence is provided in SEQ ID NO: 97. A DNA sequence corresponding to antigen (d)
above is provided in SEQ ID NO: 24, a DNA sequence corresponding to antigen (c) is
provided in SEQ ID NO: 25 and a DNA sequence corresponding to antigen (I) is disclosed in
SEQ ID NO: 94, with its deduced amino acid sequence provided in SEQ ID NO: 95.

In a further specific embodiment, the subject invention discloses polypeptides
comprising at least an immunogenic portion of an M. tuberculosis antigen having one of the
following N-terminal sequences, or a variant thereof that differs only in conservative
substitutions and/or modifications:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val; (SEQ ID NO: 132) or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID NO: 124)

wherein Xaa may be any amino acid, preferably a cysteine residue.

In other specific embodiments, the subject invention discloses polypeptides
comprising at least an antigenic portion of a soluble M. tuberculosis antigen (or a variant of
such an antigen) that comprises one or more of the amino acid sequences encoded by (a) the
DNA sequences of SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, (b) the complements of
such DNA sequences, or (c) DNA sequences substantially homologous to a sequence in (a) or
(b).

In further specific embodiments, the subject invention discloses polypeptides
comprising at least an antigenic portion of an M. tuberculosis antigen (or a variant of such an
antigen), which may or may not be soluble, that comprises one or more of the amino acid
sequences encoded by (a) the DNA sequences of SEQ ID NOS: 26-51, 133, 134, 158-178 and
196, (b) the complements of such DNA sequences or (c) DNA sequences substantially
homologous to a sequence in (a) or (b).

In the specific embodiments discussed above, the M. tuberculosis antigens
include variants that are encoded by DNA sequences which are substantially homologous to one
or more of the DNA sequences specifically recited herein. "Substantial homology," as used
herein, refers to DNA sequences that are capable of hybridizing under moderately stringent
conditions. Suitable moderately stringent conditions include prewashing in a solution of 5X
SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing at 50°C-65°C, 5X SSC, overnight or,
in the event of cross-species homology, at 45°C with 0.5X SSC; followed by washing twice
at 65°C for 20 minutes with each of 2X, 0.5X and 0.2X SSC containing 0.1% SDS. Such
hybridizing DNA sequences are also within the scope of this invention, as are nucleotide
sequences that, due to code degeneracy, encode an immunogenic polypeptide that is encoded
by a hybridizing DNA sequence.

In a related aspect, the present invention provides fusion proteins comprising a
first and a second inventive polypeptide or, alternatively, a polypeptide of the present
invention and a known M. tuberculosis antigen, such as the 38 kD antigen described above or
ESAT-6 (SEQ ID NOS: 98 and 99), together with variants of such fusion proteins. The
fusion proteins of the present invention may also include a linker peptide between the first
and second polypeptides.

A DNA sequence encoding a fusion protein of the present invention is
constructed using known recombinant DNA techniques to assemble separate DNA sequences
encoding the first and second polypeptides into an appropriate expression vector. The 3' end
of a DNA sequence encoding the first polypeptide is ligated, with or without a peptide linker,
to the 5' end of a DNA sequence encoding the second polypeptide so that the reading frames
of the sequences are in phase to permit mRNA translation of the two DNA sequences into a
single fusion protein that retains the biological activity of both the first and the second
polypeptides.

A peptide linker sequence may be employed to separate the first and the
second polypeptides by a distance sufficient to ensure that each polypeptide folds into its
secondary and tertiary structures. Such a peptide linker sequence is incorporated into the
fusion protein using standard techniques well known in the art. Suitable peptide linker
sequences may be chosen based on the following factors: (1) their ability to adopt a flexible
extended conformation; (2) their inability to adopt a secondary structure that could interact
with functional epitopes on the first and second polypeptides; and (3) the lack of hydrophobic
or charged residues that might react with the polypeptide functional epitopes. Preferred
peptide linker sequences contain Gly, Asn and Ser residues. Other near neutral amino acids,
such as Thr and Ala, may also be used in the linker sequence. Amino acid sequences which
may be usefully employed as linkers include those disclosed in Maratea et al., Gene 40:39-46,
1985; Murphy et al., Proc. Natl. Acad. Sci. USA 83:8258-8262, 1986; U.S. Patent
No. 4,935,233 and U.S. Patent No. 4,751,180. The linker sequence may be from 1 to about
50 amino acids in length. Peptide linker sequences are not required when the first and second
polypeptides have non-essential N-terminal amino acid regions that can be used to separate
the functional domains and prevent steric hindrance.
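
The compositional preferences for a linker can be expressed as a simple screen. The Python sketch below is a hypothetical helper, not part of the described method: it checks only residue composition (Gly, Asn, Ser, plus the near neutral Thr and Ala) and length, treating "about 50" as a hard limit of 50 for simplicity. It says nothing about conformation, which would have to be assessed separately.

```python
PREFERRED = frozenset("GNS")      # Gly, Asn, Ser
NEAR_NEUTRAL = frozenset("TA")    # Thr, Ala
MAX_LENGTH = 50                   # assumption: "about 50" treated as a hard limit


def check_linker(linker):
    """Return a list of reasons a one-letter linker sequence fails the screen."""
    problems = []
    if not 1 <= len(linker) <= MAX_LENGTH:
        problems.append(f"length {len(linker)} is outside 1-{MAX_LENGTH}")
    disallowed = sorted(set(linker) - PREFERRED - NEAR_NEUTRAL)
    if disallowed:
        problems.append("residues outside Gly/Asn/Ser/Thr/Ala: " + ", ".join(disallowed))
    return problems


# Hypothetical candidate linkers (illustrative only).
print(check_linker("GGGGSGGGGS"))   # passes the screen
print(check_linker("GGKGS"))        # contains a charged Lys residue
```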

In another aspect, the present invention provides methods for using the
polypeptides described above to diagnose tuberculosis. In this aspect, methods are provided
for detecting M. tuberculosis infection in a biological sample, using one or more of the above
polypeptides, alone or in combination. In embodiments in which multiple polypeptides are
employed, polypeptides other than those specifically described herein, such as the 38 kD
antigen described in Andersen and Hansen, Infect. Immun. 57:2481-2488, 1989, may be
included. As used herein, a "biological sample" is any antibody-containing sample obtained
from a patient. Preferably, the sample is whole blood, sputum, serum, plasma, saliva,
cerebrospinal fluid or urine. More preferably, the sample is a blood, serum or plasma sample
obtained from a patient or a blood supply. The polypeptide(s) are used in an assay, as
described below, to determine the presence or absence of antibodies to the polypeptide(s) in
the sample, relative to a predetermined cut-off value. The presence of such antibodies
indicates previous sensitization to mycobacterial antigens, which may be indicative of
tuberculosis.

In embodiments in which more than one polypeptide is employed, the
polypeptides used are preferably complementary (i.e., one component polypeptide will tend
to detect infection in samples where the infection would not be detected by another
component polypeptide). Complementary polypeptides may generally be identified by using
each polypeptide individually to evaluate serum samples obtained from a series of patients
known to be infected with M. tuberculosis. After determining which samples test positive (as
described below) with each polypeptide, combinations of two or more polypeptides may be
formulated that are capable of detecting infection in most, or all, of the samples tested. Such
polypeptides are complementary. For example, approximately 25-30% of sera from
tuberculosis-infected individuals are negative for antibodies to any single protein, such as the
38 kD antigen mentioned above. Complementary polypeptides may, therefore, be used in
combination with the 38 kD antigen to improve sensitivity of a diagnostic test.
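
One way to picture the search for complementary polypeptides is as a coverage problem over the panel of infected sera. The Python sketch below uses a greedy heuristic, which is an assumption of this illustration rather than a procedure stated in the text: at each step it adds the polypeptide that detects the most sera not yet detected. The antigen labels "AgA" and "AgB" and the serum identifiers are hypothetical.

```python
def greedy_panel(reactivity, samples):
    """Pick polypeptides greedily so that together they detect as many samples as possible.

    reactivity maps a polypeptide name to the set of sample ids it scores positive.
    Returns (chosen polypeptides in order, samples still undetected).
    """
    remaining = set(samples)
    panel = []
    while remaining:
        best = max(reactivity, key=lambda name: len(reactivity[name] & remaining))
        gained = reactivity[best] & remaining
        if not gained:
            break  # no polypeptide detects any of the remaining samples
        panel.append(best)
        remaining -= gained
    return panel, remaining


# Hypothetical results for ten sera from infected patients (illustrative only).
sera = range(1, 11)
results = {
    "38kD": {1, 2, 3, 4, 5, 6, 7},
    "AgA": {6, 7, 8, 9},
    "AgB": {1, 2, 10},
}
panel, missed = greedy_panel(results, sera)
print(panel, missed)
```

In this made-up data set the 38 kD antigen alone misses three of ten sera, and the two additional polypeptides recover them.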

There are a variety of assay formats known to those of ordinary skill in the art
for using one or more polypeptides to detect antibodies in a sample. See, e.g., Harlow and
Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988, which is
incorporated herein by reference. In a preferred embodiment, the assay involves the use of
polypeptide immobilized on a solid support to bind to and remove the antibody from the
sample. The bound antibody may then be detected using a detection reagent that contains a
reporter group. Suitable detection reagents include antibodies that bind to the
antibody/polypeptide complex and free polypeptide labeled with a reporter group (e.g., in a
semi-competitive assay). Alternatively, a competitive assay may be utilized, in which an
antibody that binds to the polypeptide is labeled with a reporter group and allowed to bind to
the immobilized antigen after incubation of the antigen with the sample. The extent to which
components of the sample inhibit the binding of the labeled antibody to the polypeptide is
indicative of the reactivity of the sample with the immobilized polypeptide.

The solid support may be any solid material known to those of ordinary skill
in the art to which the antigen may be attached. For example, the solid support may be a test
well in a microtiter plate or a nitrocellulose or other suitable membrane. Alternatively, the
support may be a bead or disc, such as glass, fiberglass, latex or a plastic material such as
polystyrene or polyvinylchloride. The support may also be a magnetic particle or a fiber
optic sensor, such as those disclosed, for example, in U.S. Patent No. 5,359,681.

The polypeptides may be bound to the solid support using a variety of
techniques known to those of ordinary skill in the art, which are amply described in the patent
and scientific literature. In the context of the present invention, the term "bound" refers to
both noncovalent association, such as adsorption, and covalent attachment (which may be a
direct linkage between the antigen and functional groups on the support or may be a linkage
by way of a cross-linking agent). Binding by adsorption to a well in a microtiter plate or to a
membrane is preferred. In such cases, adsorption may be achieved by contacting the
polypeptide, in a suitable buffer, with the solid support for a suitable amount of time. The
contact time varies with temperature, but is typically between about 1 hour and 1 day. In
general, contacting a well of a plastic microtiter plate (such as polystyrene or
polyvinylchloride) with an amount of polypeptide ranging from about 10 ng to about 1 µg,
and preferably about 100 ng, is sufficient to bind an adequate amount of antigen.

Covalent attachment of polypeptide to a solid support may generally be
achieved by first reacting the support with a bifunctional reagent that will react with both the
support and a functional group, such as a hydroxyl or amino group, on the polypeptide. For
example, the polypeptide may be bound to supports having an appropriate polymer coating
using benzoquinone or by condensation of an aldehyde group on the support with an amine
and an active hydrogen on the polypeptide (see, e.g., Pierce Immunotechnology Catalog and
Handbook, 1991, at A12-A13).

In certain embodiments, the assay is an enzyme linked immunosorbent assay
(ELISA). This assay may be performed by first contacting a polypeptide antigen that has
been immobilized on a solid support, commonly the well of a microtiter plate, with the
sample, such that antibodies to the polypeptide within the sample are allowed to bind to the
immobilized polypeptide. Unbound sample is then removed from the immobilized
polypeptide and a detection reagent capable of binding to the immobilized
antibody-polypeptide complex is added. The amount of detection reagent that remains bound to the
solid support is then determined using a method appropriate for the specific detection reagent.

More specifically, once the polypeptide is immobilized on the support as
described above, the remaining protein binding sites on the support are typically blocked.
Any suitable blocking agent known to those of ordinary skill in the art, such as bovine serum
albumin or Tween 20™ (Sigma Chemical Co., St. Louis, MO), may be employed. The
immobilized polypeptide is then incubated with the sample, and antibody is allowed to bind
to the antigen. The sample may be diluted with a suitable diluent, such as phosphate-buffered
saline (PBS), prior to incubation. In general, an appropriate contact time (i.e., incubation
time) is that period of time that is sufficient to detect the presence of antibody within an
M. tuberculosis-infected sample. Preferably, the contact time is sufficient to achieve a level
of binding that is at least 95% of that achieved at equilibrium between bound and unbound
antibody. Those of ordinary skill in the art will recognize that the time necessary to achieve
equilibrium may be readily determined by assaying the level of binding that occurs over a
period of time. At room temperature, an incubation time of about 30 minutes is generally
sufficient.

Unbound sample may then be removed by washing the solid support with an
appropriate buffer, such as PBS containing 0.1% Tween 20™. Detection reagent may then be
added to the solid support. An appropriate detection reagent is any compound that binds to
the immobilized antibody-polypeptide complex and that can be detected by any of a variety
of means known to those in the art. Preferably, the detection reagent contains a binding agent
(such as, for example, Protein A, Protein G, immunoglobulin, lectin or free antigen)
conjugated to a reporter group. Preferred reporter groups include enzymes (such as
horseradish peroxidase), substrates, cofactors, inhibitors, dyes, radionuclides, luminescent
groups, fluorescent groups and biotin. The conjugation of binding agent to reporter group
may be achieved using standard methods known to those of ordinary skill in the art.
Common binding agents may also be purchased conjugated to a variety of reporter groups
from many commercial sources (e.g., Zymed Laboratories, San Francisco, CA, and Pierce,
Rockford, IL).

The detection reagent is then incubated with the immobilized
antibody-polypeptide complex for an amount of time sufficient to detect the bound antibody. An
appropriate amount of time may generally be determined from the manufacturer's instructions
or by assaying the level of binding that occurs over a period of time. Unbound detection
reagent is then removed and bound detection reagent is detected using the reporter group.
The method employed for detecting the reporter group depends upon the nature of the
reporter group. For radioactive groups, scintillation counting or autoradiographic methods
are generally appropriate. Spectroscopic methods may be used to detect dyes, luminescent
groups and fluorescent groups. Biotin may be detected using avidin, coupled to a different
reporter group (commonly a radioactive or fluorescent group or an enzyme). Enzyme
reporter groups may generally be detected by the addition of substrate (generally for a
specific period of time), followed by spectroscopic or other analysis of the reaction products.

To determine the presence or absence of anti-M. tuberculosis antibodies in the
sample, the signal detected from the reporter group that remains bound to the solid support is
generally compared to a signal that corresponds to a predetermined cut-off value. In one
preferred embodiment, the cut-off value is the mean signal obtained when the
immobilized antigen is incubated with samples from an uninfected patient. In general, a
sample generating a signal that is three standard deviations above the predetermined cut-off
value is considered positive for tuberculosis. In an alternate preferred embodiment, the
cut-off value is determined using a Receiver Operator Curve, according to the method of Sackett
et al., Clinical Epidemiology: A Basic Science for Clinical Medicine, Little Brown and Co.,
1985, pp. 106-107. Briefly, in this embodiment, the cut-off value may be determined from a
plot of pairs of true positive rates (i.e., sensitivity) and false positive rates (100%-specificity)
that correspond to each possible cut-off value for the diagnostic test result. The cut-off value
on the plot that is the closest to the upper left-hand corner (i.e., the value that encloses the
largest area) is the most accurate cut-off value, and a sample generating a signal that is higher
than the cut-off value determined by this method may be considered positive. Alternatively,
the cut-off value may be shifted to the left along the plot, to minimize the false positive rate,
or to the right, to minimize the false negative rate. In general, a sample generating a signal
that is higher than the cut-off value determined by this method is considered positive for
tuberculosis.
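
The Receiver Operator Curve approach can be sketched as follows. This Python illustration makes several assumptions that the text does not fix: candidate cut-offs are taken from the observed signals, a sample counts as positive when its signal is strictly higher than the cut-off, and "closest to the upper left-hand corner" is measured as Euclidean distance to the point with a false positive rate of 0 and a true positive rate of 1. The signal values are hypothetical.

```python
from math import hypot


def roc_points(infected, uninfected):
    """Return (cutoff, true positive rate, false positive rate) for each candidate cut-off."""
    points = []
    for cutoff in sorted(set(infected) | set(uninfected)):
        tpr = sum(x > cutoff for x in infected) / len(infected)
        fpr = sum(x > cutoff for x in uninfected) / len(uninfected)
        points.append((cutoff, tpr, fpr))
    return points


def best_cutoff(infected, uninfected):
    """Cut-off whose ROC point lies nearest the upper left-hand corner (FPR 0, TPR 1)."""
    nearest = min(roc_points(infected, uninfected),
                  key=lambda p: hypot(p[2], 1.0 - p[1]))
    return nearest[0]


# Hypothetical ELISA signals (illustrative only, not data from this document).
infected_signals = [0.90, 0.80, 0.70, 0.28]
uninfected_signals = [0.10, 0.20, 0.15, 0.30, 0.25]
print(best_cutoff(infected_signals, uninfected_signals))
```

Shifting the chosen value up or down from this point trades sensitivity against specificity, as described above.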

In a related embodiment, the assay is performed in a rapid flow-through or
strip test format, wherein the antigen is immobilized on a membrane, such as nitrocellulose.
In the flow-through test, antibodies within the sample bind to the immobilized polypeptide as
the sample passes through the membrane. A detection reagent (e.g., protein A-colloidal gold)
then binds to the antibody-polypeptide complex as the solution containing the detection
reagent flows through the membrane. The detection of bound detection reagent may then be
performed as described above. In the strip test format, one end of the membrane to which
polypeptide is bound is immersed in a solution containing the sample. The sample migrates
along the membrane through a region containing detection reagent and to the area of
immobilized polypeptide. Concentration of detection reagent at the polypeptide indicates the
presence of anti-M. tuberculosis antibodies in the sample. Typically, the concentration of
detection reagent at that site generates a pattern, such as a line, that can be read visually. The
absence of such a pattern indicates a negative result. In general, the amount of polypeptide
immobilized on the membrane is selected to generate a visually discernible pattern when the
biological sample contains a level of antibodies that would be sufficient to generate a positive
signal in an ELISA, as discussed above. Preferably, the amount of polypeptide immobilized
on the membrane ranges from about 25 ng to about 1 µg, and more preferably from about
50 ng to about 500 ng. Such tests can typically be performed with a very small amount (e.g.,
one drop) of patient serum or blood.

Of course, numerous other assay protocols exist that are suitable for use with
the polypeptides of the present invention. The above descriptions are intended to be
exemplary only.

In yet another aspect, the present invention provides antibodies to the
inventive polypeptides. Antibodies may be prepared by any of a variety of techniques known
to those of ordinary skill in the art. See, e.g., Harlow and Lane, Antibodies: A Laboratory
Manual, Cold Spring Harbor Laboratory, 1988. In one such technique, an immunogen
comprising the antigenic polypeptide is initially injected into any of a wide variety of
mammals (e.g., mice, rats, rabbits, sheep and goats). In this step, the polypeptides of this
invention may serve as the immunogen without modification. Alternatively, particularly for
relatively short polypeptides, a superior immune response may be elicited if the polypeptide
is joined to a carrier protein, such as bovine serum albumin or keyhole limpet hemocyanin.
The immunogen is injected into the animal host, preferably according to a predetermined
schedule incorporating one or more booster immunizations, and the animals are bled
periodically. Polyclonal antibodies specific for the polypeptide may then be purified from
such antisera by, for example, affinity chromatography using the polypeptide coupled to a
suitable solid support.

Monoclonal antibodies specific for the antigenic polypeptide of interest may
be prepared, for example, using the technique of Kohler and Milstein, Eur. J. Immunol.
6:511-519, 1976, and improvements thereto. Briefly, these methods involve the preparation
of immortal cell lines capable of producing antibodies having the desired specificity (i.e.,
reactivity with the polypeptide of interest). Such cell lines may be produced, for example,
from spleen cells obtained from an animal immunized as described above. The spleen cells
are then immortalized by, for example, fusion with a myeloma cell fusion partner, preferably
one that is syngeneic with the immunized animal. A variety of fusion techniques may be
employed. For example, the spleen cells and myeloma cells may be combined with a
nonionic detergent for a few minutes and then plated at low density on a selective medium
that supports the growth of hybrid cells, but not myeloma cells. A preferred selection
technique uses HAT (hypoxanthine, aminopterin, thymidine) selection. After a sufficient
time, usually about 1 to 2 weeks, colonies of hybrids are observed. Single colonies are
selected and tested for binding activity against the polypeptide. Hybridomas having high
reactivity and specificity are preferred.

Monoclonal antibodies may be isolated from the supernatants of growing
hybridoma colonies. In addition, various techniques may be employed to enhance the yield,
such as injection of the hybridoma cell line into the peritoneal cavity of a suitable vertebrate
host, such as a mouse. Monoclonal antibodies may then be harvested from the ascites fluid or
the blood. Contaminants may be removed from the antibodies by conventional techniques,
such as chromatography, gel filtration, precipitation, and extraction. The polypeptides of this
invention may be used in the purification process in, for example, an affinity chromatography
step.

Antibodies may be used in diagnostic tests to detect the presence of
M. tuberculosis antigens using assays similar to those detailed above and other techniques
well known to those of skill in the art, thereby providing a method for detecting
M. tuberculosis infection in a patient.

Diagnostic reagents of the present invention may also comprise DNA
sequences encoding one or more of the above polypeptides, or one or more portions thereof.
For example, at least two oligonucleotide primers may be employed in a polymerase chain
reaction (PCR) based assay to amplify M. tuberculosis-specific cDNA derived from a
biological sample, wherein at least one of the oligonucleotide primers is specific for a DNA
molecule encoding a polypeptide of the present invention. The presence of the amplified
cDNA is then detected using techniques well known in the art, such as gel electrophoresis.
Similarly, oligonucleotide probes specific for a DNA molecule encoding a polypeptide of the
present invention may be used in a hybridization assay to detect the presence of an inventive
polypeptide in a biological sample.

As used herein, the term "oligonucleotide primer/probe specific for a DNA
molecule" means an oligonucleotide sequence that has at least about 80%, preferably at least
about 90% and more preferably at least about 95%, identity to the DNA molecule in question.
Oligonucleotide primers and/or probes which may be usefully employed in the inventive
diagnostic methods preferably have at least about 10-40 nucleotides. In a preferred
embodiment, the oligonucleotide primers comprise at least about 10 contiguous nucleotides
of a DNA molecule encoding one of the polypeptides disclosed herein. Preferably,
oligonucleotide probes for use in the inventive diagnostic methods comprise at least about 15
contiguous nucleotides of a DNA molecule encoding one of the polypeptides disclosed
herein. Techniques for both PCR based assays and hybridization assays are well known in
the art (see, for example, Mullis et al. Ibid; Ehrlich, Ibid). Primers or probes may thus be
used to detect M. tuberculosis-specific sequences in biological samples. DNA probes or
primers comprising oligonucleotide sequences described above may be used alone, in
combination with each other, or with previously identified sequences, such as the 38 kD
antigen discussed above.
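
The identity criterion for a primer or probe can be illustrated with a short Python sketch. It assumes the simplest reading of "identity", namely the percentage of matching positions in an ungapped comparison against the best-matching stretch of one strand of the target; a real design workflow would typically also consider the reverse complement and gapped alignment. The function names and the sequences are hypothetical and do not correspond to any SEQ ID NO in this document.

```python
def percent_identity(oligo, segment):
    """Percentage of matching positions between two equal-length sequences."""
    if len(oligo) != len(segment):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(oligo, segment))
    return 100.0 * matches / len(oligo)


def best_identity(oligo, target):
    """Highest ungapped identity of the oligo against any window of the target strand."""
    oligo, target = oligo.upper(), target.upper()
    if not oligo or len(oligo) > len(target):
        raise ValueError("oligo must be non-empty and no longer than the target")
    size = len(oligo)
    return max(percent_identity(oligo, target[i:i + size])
               for i in range(len(target) - size + 1))


def is_specific(oligo, target, threshold=80.0):
    """Apply the 'at least about 80% identity' criterion as a hard threshold."""
    return best_identity(oligo, target) >= threshold


# Hypothetical target and primers (illustrative only).
target = "ATGGCCGATCCGGTTGACGCG"
print(best_identity("GCCGATCCGG", target))   # exact match to a target window
print(best_identity("GCCGATCAGG", target))   # one mismatch in ten positions
print(is_specific("TTTTTTTTTT", target))
```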

The following Examples are offered by way of illustration and not by way of
limitation.

EXAMPLES 



EXAMPLE 1

Purification and Characterization of Polypeptides
from M. tuberculosis Culture Filtrate



This example illustrates the preparation of M. tuberculosis soluble 
polypeptides from culture filtrate. Unless otherwise noted, all percentages in the following 
example are weight per volume.

M. tuberculosis (either H37Ra, ATCC No. 25177, or H37Rv, ATCC
No. 25618) was cultured in sterile GAS media at 37°C for fourteen days. The media was
then vacuum filtered (leaving the bulk of the cells) through a 0.45 µ filter into a sterile 2.5 L
bottle. The media was then filtered through a 0.2 µ filter into a sterile 4 L bottle. NaN3 was
then added to the culture filtrate to a concentration of 0.04%. The bottles were then placed in
a 4°C cold room.

The culture filtrate was concentrated by placing the filtrate in a 12 L reservoir
that had been autoclaved and feeding the filtrate into a 400 ml Amicon stir cell which had
been rinsed with ethanol and contained a 10,000 kDa MWCO membrane. The pressure was
maintained at 60 psi using nitrogen gas. This procedure reduced the 12 L volume to
approximately 50 ml.

The culture filtrate was then dialyzed into 0.1% ammonium bicarbonate using
an 8,000 kDa MWCO cellulose ester membrane, with two changes of ammonium bicarbonate
solution. Protein concentration was then determined by a commercially available BCA assay
(Pierce, Rockford, IL).

The dialyzed culture filtrate was then lyophilized, and the polypeptides 
resuspended in distilled water. The polypeptides were then dialyzed against 0.01 mM 1,3 
bis[tris(hydroxymethyl)-methylamino]propane, pH 7.5 (Bis-Tris propane buffer), the initial 
conditions for anion exchange chromatography. Fractionation was performed using gel 
profusion chromatography on a POROS 146 II Q/M anion exchange column 4.6 mm x
100 mm (Perseptive BioSystems, Framingham, MA) equilibrated in 0.01 mM Bis-Tris 
propane buffer pH 7.5. Polypeptides were eluted with a linear 0-0.5 M NaCl gradient in the 
above buffer system. The column eluent was monitored at a wavelength of 220 nm. 

The pools of polypeptides eluting from the ion exchange column were 
dialyzed against distilled water and lyophilized. The resulting material was dissolved in 0.1%
trifluoroacetic acid (TFA) pH 1.9 in water, and the polypeptides were purified on a Delta-Pak
C18 column (Waters, Milford, MA) 300 Angstrom pore size, 5 micron particle size (3.9 x
150 mm). The polypeptides were eluted from the column with a linear gradient from 0-60%
dilution buffer (0.1% TFA in acetonitrile). The flow rate was 0.75 ml/minute and the HPLC
eluent was monitored at 214 nm. Fractions containing the eluted polypeptides were collected
to maximize the purity of the individual samples. Approximately 200 purified polypeptides
were obtained.

The purified polypeptides were then screened for the ability to induce T-cell 
proliferation in PBMC preparations. The PBMCs from donors known to be PPD skin test 
positive and whose T cells were shown to proliferate in response to PPD and crude soluble
proteins from MTB were cultured in medium comprising RPMI 1640 supplemented with
10% pooled human serum and 50 μg/ml gentamicin. Purified polypeptides were added in
duplicate at concentrations of 0.5 to 10 ng/mL. After six days of culture in 96-well round-
bottom plates in a volume of 200 μl, 50 μl of medium was removed from each well for
determination of IFN-γ levels, as described below. The plates were then pulsed with
1 μCi/well of tritiated thymidine for a further 18 hours, harvested and tritium uptake
determined using a gas scintillation counter. Fractions that resulted in proliferation in both
replicates three fold greater than the proliferation observed in cells cultured in medium alone 
were considered positive. 
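
The positivity criterion for the proliferation assay can be sketched as follows. This is a minimal illustration in Python, assuming that tritium uptake is recorded as counts per minute and that "three fold greater" means strictly more than three times the medium-alone value; the function and argument names are hypothetical.

```python
def is_proliferation_positive(replicate_cpm, medium_cpm, fold=3.0):
    """Return True when every replicate exceeds `fold` times the counts
    observed for cells cultured in medium alone."""
    threshold = fold * medium_cpm
    return all(cpm > threshold for cpm in replicate_cpm)


# Both replicates must clear the threshold (here 3 x 1000 cpm).
print(is_proliferation_positive([4000, 3500], medium_cpm=1000))
print(is_proliferation_positive([4000, 2500], medium_cpm=1000))
```

Requiring both replicates to pass, rather than their mean, follows the wording of the criterion above.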

IFN-γ was measured using an enzyme-linked immunosorbent assay (ELISA).
ELISA plates were coated with a mouse monoclonal antibody directed to human IFN-γ
(Chemicon) in PBS for four hours at room temperature. Wells were then blocked with PBS
containing 5% (W/V) non-fat dried milk for 1 hour at room temperature. The plates were
then washed six times in PBS/0.2% TWEEN-20 and samples diluted 1:2 in culture medium
in the ELISA plates were incubated overnight at room temperature. The plates were again
washed and a polyclonal rabbit anti-human IFN-γ serum diluted 1:3000 in PBS/10% normal
goat serum was added to each well. The plates were then incubated for two hours at room
temperature, washed and horseradish peroxidase-coupled anti-rabbit IgG (Jackson Labs.) was
added at a 1:2000 dilution in PBS/5% non-fat dried milk. After a further two hour incubation
at room temperature, the plates were washed and TMB substrate added. The reaction was
stopped after 20 min with 1 N sulfuric acid. Optical density was determined at 450 nm using
570 nm as a reference wavelength. Fractions that resulted in both replicates giving an OD
two fold greater than the mean OD from cells cultured in medium alone, plus 3 standard 
deviations, were considered positive. 
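
The ELISA cut-off described above may be sketched as follows. The wording admits more than one reading; this illustration assumes the cut-off is twice the mean OD of the medium-alone wells plus three sample standard deviations of those wells. The function names and example OD values are hypothetical.

```python
from statistics import mean, stdev


def ifn_gamma_cutoff(medium_ods, fold=2.0, n_sd=3.0):
    """Cut-off OD: `fold` x mean of medium-alone wells plus `n_sd` sample SDs
    (one possible reading of the criterion; an assumption, not a quoted formula)."""
    return fold * mean(medium_ods) + n_sd * stdev(medium_ods)


def is_ifn_gamma_positive(replicate_ods, medium_ods):
    """Return True when both (all) replicate ODs exceed the cut-off."""
    cutoff = ifn_gamma_cutoff(medium_ods)
    return all(od > cutoff for od in replicate_ods)


medium = [0.10, 0.12, 0.08]  # illustrative background wells
print(round(ifn_gamma_cutoff(medium), 3))
print(is_ifn_gamma_positive([0.30, 0.35], medium))
```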
For sequencing, the polypeptides were individually dried onto
Biobrene™ (Perkin Elmer/Applied BioSystems Division, Foster City, CA) treated glass fiber
filters. The filters with polypeptide were loaded onto a Perkin Elmer/Applied BioSystems
Division Procise 492 protein sequencer. The polypeptides were sequenced from the amino
terminal using traditional Edman chemistry. The amino acid sequence was determined
for each polypeptide by comparing the retention time of the PTH amino acid derivative to the
appropriate PTH derivative standards.
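
The residue assignment step, matching an observed retention time to PTH standards, can be illustrated as below. This is a sketch only: the retention times, the tolerance and the function name are hypothetical values chosen for illustration and are not taken from the sequencer or from this disclosure. An unassignable cycle is reported as Xaa.

```python
def call_residue(retention_min, standards, tolerance=0.15):
    """Assign the PTH standard whose retention time is closest to the
    observed peak, or "Xaa" when no standard lies within `tolerance` minutes."""
    best = min(standards, key=lambda name: abs(standards[name] - retention_min))
    if abs(standards[best] - retention_min) <= tolerance:
        return best
    return "Xaa"


# Hypothetical retention times (minutes) for three PTH standards.
pth_standards = {"Asp": 5.2, "Pro": 14.8, "Val": 16.1}

cycles = [5.25, 14.75, 16.05, 10.0]
print("-".join(call_residue(rt, pth_standards) for rt in cycles))
```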

Using the procedure described above, antigens having the following 
N-terminal sequences were isolated: 
(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Xaa-Asn-Tyr-Gly-Gln-
    Val-Val-Ala-Ala-Leu (SEQ ID NO: 54);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
    (SEQ ID NO: 55);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
    Lys-Glu-Gly-Arg (SEQ ID NO: 56);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro
    (SEQ ID NO: 57);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
    NO: 58);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
    NO: 59);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala-Pro-Pro-
    Ala (SEQ ID NO: 60); and

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
    (SEQ ID NO: 61);

wherein Xaa may be any amino acid. 

An additional antigen was isolated employing a microbore HPLC purification 
step in addition to the procedure described above. Specifically, 20 μl of a fraction comprising
a mixture of antigens from the chromatographic purification step previously described, was
purified on an Aquapore C18 column (Perkin Elmer/Applied Biosystems Division, Foster
City, CA) with a 7 micron pore size, column size 1 mm x 100 mm, in a Perkin Elmer/Applied
Biosystems Division Model 172 HPLC. Fractions were eluted from the column with a linear
gradient of 1%/minute of acetonitrile (containing 0.05% TFA) in water (0.05% TFA) at a
flow rate of 80 μl/minute. The eluent was monitored at 250 nm. The original fraction was
separated into 4 major peaks plus other smaller components and a polypeptide was obtained
which was shown to have a molecular weight of 12.054 Kd (by mass spectrometry) and the
following N-terminal sequence: 

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-Thr-Ser-
    Leu-Leu-Asn-Asn-Leu-Ala-Asp-Pro-Asp-Val-Ser-Phe-Ala-Asp (SEQ
    ID NO: 62).

This polypeptide was shown to induce proliferation and IFN-γ production in PBMC
preparations using the assays described above. 

Additional soluble antigens were isolated from M. tuberculosis culture filtrate
as follows. M. tuberculosis culture filtrate was prepared as described above. Following
dialysis against Bis-Tris propane buffer, at pH 5.5, fractionation was performed using anion
exchange chromatography on a Poros QE column 4.6 x 100 mm (Perseptive Biosystems)
equilibrated in Bis-Tris propane buffer pH 5.5. Polypeptides were eluted with a linear 0-1.5
M NaCl gradient in the above buffer system at a flow rate of 10 ml/min. The column eluent
was monitored at a wavelength of 214 nm.

The fractions eluting from the ion exchange column were pooled and
subjected to reverse phase chromatography using a Poros R2 column 4.6 x 100 mm
(Perseptive Biosystems). Polypeptides were eluted from the column with a linear gradient
from 0-100% acetonitrile (0.1% TFA) at a flow rate of 5 ml/min. The eluent was monitored
at 214 nm.

Fractions containing the eluted polypeptides were lyophilized and resuspended
in 80 μl of aqueous 0.1% TFA and further subjected to reverse phase chromatography on a
Vydac C4 column 4.6 x 150 mm (Western Analytical, Temecula, CA) with a linear gradient
of 0-100% acetonitrile (0.1% TFA) at a flow rate of 2 ml/min. Eluent was monitored at 214
nm.

The fraction with biological activity was separated into one major peak plus
other smaller components. Western blot of this peak onto PVDF membrane revealed three
major bands of molecular weights 14 Kd, 20 Kd and 26 Kd. These polypeptides were
determined to have the following N-terminal sequences, respectively:

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser;
    (SEQ ID NO: 129)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp;
    (SEQ ID NO: 130) and

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly;
    (SEQ ID NO: 131), wherein Xaa may be any amino acid.

Using the assays described above, these polypeptides were shown to induce proliferation and
IFN-γ production in PBMC preparations. Figs. 1A and B show the results of such assays
using PBMC preparations from a first and a second donor, respectively.

DNA sequences that encode the antigens designated as (a), (c), (d) and (g)
above were obtained by screening an M. tuberculosis genomic library using 32P end labeled
degenerate oligonucleotides corresponding to the N-terminal sequence and containing
M. tuberculosis codon bias. The screen performed using a probe corresponding to antigen (a)
above identified a clone having the sequence provided in SEQ ID NO: 96. The polypeptide
encoded by SEQ ID NO: 96 is provided in SEQ ID NO: 97. The screen performed using a
probe corresponding to antigen (g) above identified a clone having the sequence provided in
SEQ ID NO: 52. The polypeptide encoded by SEQ ID NO: 52 is provided in SEQ ID
NO: 53. The screen performed using a probe corresponding to antigen (d) above identified a
clone having the sequence provided in SEQ ID NO: 24, and the screen performed with a
probe corresponding to antigen (c) identified a clone having the sequence provided in SEQ ID
NO: 25.
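
The design of a degenerate oligonucleotide from an N-terminal peptide sequence can be sketched as a back-translation into IUPAC ambiguity codes. The example below is an illustration under stated assumptions and is not the probe design actually used: Arg, Leu and Ser are represented only by their four-codon families, and the "codon bias" option simply restricts the third codon position to G or C, a simplification reflecting the high G+C content of mycobacterial DNA. All names in the code are hypothetical.

```python
# Degenerate codon for each residue, written with IUPAC ambiguity codes.
# Arg, Leu and Ser are limited to their four-codon families (a simplification).
DEGENERATE_CODON = {
    "Ala": "GCN", "Arg": "CGN", "Asn": "AAY", "Asp": "GAY", "Cys": "TGY",
    "Gln": "CAR", "Glu": "GAR", "Gly": "GGN", "His": "CAY", "Ile": "ATH",
    "Leu": "CTN", "Lys": "AAR", "Met": "ATG", "Phe": "TTY", "Pro": "CCN",
    "Ser": "TCN", "Thr": "ACN", "Trp": "TGG", "Tyr": "TAY", "Val": "GTN",
    "Xaa": "NNN",
}

# Assumed bias: keep only G/C-ending codons at the third position.
GC_THIRD = {"N": "S", "Y": "C", "R": "G", "H": "C"}

# Number of bases represented by each IUPAC symbol used above.
IUPAC_SIZE = {"A": 1, "C": 1, "G": 1, "T": 1,
              "R": 2, "Y": 2, "S": 2, "H": 3, "N": 4}


def back_translate(peptide, gc_bias=False):
    """Back-translate a hyphen-separated three-letter peptide into a
    degenerate oligonucleotide; optionally apply the G/C third-base bias."""
    codons = []
    for residue in peptide.split("-"):
        codon = DEGENERATE_CODON[residue]
        if gc_bias:
            codon = codon[:2] + GC_THIRD.get(codon[2], codon[2])
        codons.append(codon)
    return "".join(codons)


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    total = 1
    for base in oligo:
        total *= IUPAC_SIZE[base]
    return total


n_term = "Asp-Pro-Val-Asp-Ala-Val"  # first six residues of antigen (a)
for bias in (False, True):
    oligo = back_translate(n_term, gc_bias=bias)
    print(oligo, degeneracy(oligo))
```

Restricting the third position in this way reduces the number of sequences in the probe pool, which is the practical purpose of incorporating codon bias into a degenerate probe.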

The above amino acid sequences were compared to known amino acid 
sequences in the gene bank using the DNA STAR system. The database searched contains 
some 173,000 proteins and is a combination of the Swiss, PIR databases along with translated 
protein sequences (Version 87). No significant homologies to the amino acid sequences for 
antigens (a)-(h) and (l) were detected.

The amino acid sequence for antigen (i) was found to be homologous to a
sequence from M. leprae. The full length M. leprae sequence was amplified from genomic
DNA using the sequence obtained from GENBANK. This sequence was then used to screen
an M. tuberculosis library and a full length copy of the M. tuberculosis homologue was
obtained (SEQ ID NO: 94).

The amino acid sequence for antigen (j) was found to be homologous to a 
known M. tuberculosis protein translated from a DNA sequence. To the best of the
inventors' knowledge, this protein has not been previously shown to possess T-cell
stimulatory activity. The amino acid sequence for antigen (k) was found to be related to a
sequence from M. leprae.

In the proliferation and IFN-γ assays described above, using three PPD
positive donors, the results for representative antigens provided above are presented in Table
1:

TABLE 1

Results of PBMC Proliferation and IFN-γ Assays

Sequence    Proliferation    IFN-γ
(a)         +
(c)                          +++
(d)         ++               ++
(g)         +++              +++
(h)                          +++



In Table 1, responses that gave a stimulation index (SI) of between 2 and 4
(compared to cells cultured in medium alone) were scored as +, an SI of 4-8 or 2-4 at a
concentration of 1 μg or less was scored as ++ and an SI of greater than 8 was scored as +++.
The antigen of sequence (i) was found to have a high SI (+++) for one donor and lower SI
(++ and +) for the two other donors in both proliferation and IFN-γ assays. These results
indicate that these antigens are capable of inducing proliferation and/or interferon-γ
production.
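
The scoring scheme used for Table 1 can be expressed as a small function. This is a sketch only: the handling of the boundary values (an SI of exactly 4 or 8) and the "-" returned for an SI below 2 are assumptions, since the text above does not specify them, and the function name is hypothetical.

```python
def score_si(si, concentration_ug=None):
    """Map a stimulation index (SI) to the +/++/+++ scale of Table 1.

    `concentration_ug` is the antigen concentration in micrograms, if known;
    an SI of 2-4 obtained at 1 microgram or less is promoted to ++.
    """
    if si > 8:
        return "+++"
    if si >= 4:
        return "++"
    if si >= 2:
        if concentration_ug is not None and concentration_ug <= 1:
            return "++"
        return "+"
    return "-"  # assumed label for responses below the scoring range


print(score_si(3), score_si(3, concentration_ug=0.5), score_si(6), score_si(9))
```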

EXAMPLE 2 

Use of Patient Sera to Isolate M. tuberculosis Antigens

This example illustrates the isolation of antigens from M. tuberculosis lysate
by screening with serum from M. tuberculosis-infected individuals.

Desiccated M. tuberculosis H37Ra (Difco Laboratories) was added to a 2%
NP40 solution, and alternately homogenized and sonicated three times. The resulting
suspension was centrifuged at 13,000 rpm in microfuge tubes and the supernatant put through
a 0.2 micron syringe filter. The filtrate was bound to Macro Prep DEAE beads (BioRad,
Hercules, CA). The beads were extensively washed with 20 mM Tris pH 7.5 and bound
proteins eluted with 1 M NaCl. The NaCl eluate was dialyzed overnight against 10 mM Tris,
pH 7.5. Dialyzed solution was treated with DNase and RNase at 0.05 mg/ml for 30 min. at
room temperature and then with α-D-mannosidase, 0.5 U/mg at pH 4.5 for 3-4 hours at room
temperature. After returning to pH 7.5, the material was fractionated via FPLC over a Bio
Scale-Q-20 column (BioRad). Fractions were combined into nine pools, concentrated in a
Centriprep 10 (Amicon, Beverley, MA) and screened by Western blot for serological activity
using a serum pool from M. tuberculosis-infected patients which was not immunoreactive
with other antigens of the present invention.

The most reactive fraction was run in SDS-PAGE and transferred to PVDF. A 
band at approximately 85 Kd was cut out yielding the sequence: 

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
    Asn-Val-His-Leu-Val; (SEQ ID NO: 132), wherein Xaa may be any
    amino acid.

Comparison of this sequence with those in the gene bank as described above, 
revealed no significant homologies to known sequences. 

A DNA sequence that encodes the antigen designated as (m) above was 
obtained by screening a genomic M. tuberculosis Erdman strain library using labeled
degenerate oligonucleotides corresponding to the N-terminal sequence of SEQ ID NO: 137. A
clone was identified having the DNA sequence provided in SEQ ID NO: 198. This sequence
was found to encode the amino acid sequence provided in SEQ ID NO: 199. Comparison of
these sequences with those in the genebank revealed some similarity to sequences previously
identified in M. tuberculosis and M. bovis.

EXAMPLE 3 

Preparation of DNA Sequences Encoding M. tuberculosis Antigens

This example illustrates the preparation of DNA sequences encoding
M. tuberculosis antigens by screening an M. tuberculosis expression library with sera obtained
from patients infected with M. tuberculosis, or with anti-sera raised against M. tuberculosis
antigens.

A. Preparation of M. tuberculosis Soluble Antigens using Rabbit Anti-sera
Raised against M. tuberculosis Supernatant

Genomic DNA was isolated from the M. tuberculosis strain H37Ra. The DNA
was randomly sheared and used to construct an expression library using the Lambda ZAP
expression system (Stratagene, La Jolla, CA). Rabbit anti-sera was generated against
secretory proteins of the M. tuberculosis strains H37Ra, H37Rv and Erdman by immunizing a
rabbit with concentrated supernatant of the M. tuberculosis cultures. Specifically, the rabbit
was first immunized subcutaneously with 200 μg of protein antigen in a total volume of 2 ml
containing 100 μg muramyl dipeptide (Calbiochem, La Jolla, CA) and 1 ml of incomplete
Freund's adjuvant. Four weeks later the rabbit was boosted subcutaneously with 100 μg
antigen in incomplete Freund's adjuvant. Finally, the rabbit was immunized intravenously
four weeks later with 50 μg protein antigen. The anti-sera were used to screen the expression
library as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques
expressing immunoreactive antigens were purified. Phagemid from the plaques was rescued
and the nucleotide sequences of the M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 25 represent sequences that have
not been previously identified in M. tuberculosis. Proteins were induced by IPTG and
purified by gel elution, as described in Skeiky et al., J. Exp. Med. 181:1527-1537, 1995.
Representative partial sequences of DNA molecules identified in this screen are provided in
SEQ ID NOS: 1-25. The corresponding predicted amino acid sequences are shown in SEQ
ID NOS: 64-88.

On comparison of these sequences with known sequences in the gene bank
using the databases described above, it was found that the clones referred to hereinafter as
TbRA2A, TbRA16, TbRA18, and TbRA29 (SEQ ID NOS: 77, 69, 71, 76) show some
homology to sequences previously identified in Mycobacterium leprae but not in
M. tuberculosis. TbRA11, TbRA26, TbRA28 and TbDPEP (SEQ ID NOS: 66, 74, 75, 53)
have been previously identified in M. tuberculosis. No significant homologies were found to
TbRA1, TbRA3, TbRA4, TbRA9, TbRA10, TbRA13, TbRA17, TbRA19, TbRA29,
TbRA32, TbRA36 and the overlapping clones TbRA35 and TbRA12 (SEQ ID NOS: 64, 78,
82, 83, 65, 68, 76, 72, 76, 79, 81, 80, 67, respectively). The clone TbRa24 is overlapping
with clone TbRa29.

B. Use of Sera from Patients having Pulmonary or Pleural Tuberculosis to
Identify DNA Sequences Encoding M. tuberculosis Antigens

The genomic DNA library described above, and an additional H37Rv library,
were screened using pools of sera obtained from patients with active tuberculosis. To prepare
the H37Rv library, M. tuberculosis strain H37Rv genomic DNA was isolated, subjected to
partial Sau3A digestion and used to construct an expression library using the Lambda Zap
expression system (Stratagene, La Jolla, CA). Three different pools of sera, each containing
sera obtained from three individuals with active pulmonary or pleural disease, were used in
the expression screening. The pools were designated TbL, TbM and TbH, referring to
relative reactivity with H37Ra lysate (i.e., TbL = low reactivity, TbM = medium reactivity
and TbH = high reactivity) in both ELISA and immunoblot format. A fourth pool of sera
from seven patients with active pulmonary tuberculosis was also employed. All of the sera
lacked increased reactivity with the recombinant 38 kD M. tuberculosis H37Ra phosphate-
binding protein.

All pools were pre-adsorbed with E. coli lysate and used to screen the H37Ra
and H37Rv expression libraries, as described in Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989.
Bacteriophage plaques expressing immunoreactive antigens were purified. Phagemid from
the plaques was rescued and the nucleotide sequences of the M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 31 represented sequences that had
not been previously identified in human M. tuberculosis. Representative sequences of the
DNA molecules identified are provided in SEQ ID NOS: 26-51 and 100. Of these, TbH-8-2
(SEQ. ID NO. 100) is a partial clone of TbH-8, and TbH-4 (SEQ. ID NO. 43) and
TbH-4-FWD (SEQ. ID NO. 44) are non-contiguous sequences from the same clone. Amino acid
sequences for the antigens hereinafter identified as Tb38-1, TbH-4, TbH-8, TbH-9, and
TbH-12 are shown in SEQ ID NOS.: 89-93. Comparison of these sequences with known
sequences in the gene bank using the databases identified above revealed no significant
homologies to TbH-4, TbH-8, TbH-9 and TbM-3, although weak homologies were found to
TbH-9. TbH-12 was found to be homologous to a 34 kD antigenic protein previously
identified in M. paratuberculosis (Acc. No. S28515). Tb38-1 was found to be located 34
base pairs upstream of the open reading frame for the antigen ESAT-6 previously identified
in M. bovis (Acc. No. U34848) and in M. tuberculosis (Sorensen et al., Infec. Immun.
63:1710-1717, 1995).

Probes derived from Tb38-1 and TbH-9, both isolated from an H37Ra library,
were used to identify clones in an H37Rv library. Tb38-1 hybridized to Tb38-1F2, Tb38-1F3,
Tb38-1F5 and Tb38-1F6 (SEQ. ID NOS: 107, 108, 111, 113, and 114). (SEQ ID NOS:
107 and 108 are non-contiguous sequences from clone Tb38-1F2.) Two open reading frames
were deduced in Tb38-1F2; one corresponds to Tb37FL (SEQ. ID. NO. 109), the second, a
partial sequence, may be the homologue of Tb38-1 and is called Tb38-IN (SEQ. ID NO. 110).
The deduced amino acid sequence of Tb38-1F3 is presented in SEQ. ID. NO. 112. A TbH-9
probe identified three clones in the H37Rv library: TbH-9-FL (SEQ. ID NO. 101), which
may be the homologue of TbH-9 (H37Ra), TbH-9-1 (SEQ. ID NO. 103), and TbH-8-2 (SEQ.
ID NO. 105), which is a partial clone of TbH-8. The deduced amino acid sequences for these
three clones are presented in SEQ ID NOS: 102, 104 and 106.

Further screening of the M. tuberculosis genomic DNA library, as described
above, resulted in the recovery of ten additional reactive clones, representing seven different
genes. One of these genes was identified as the 38 Kd antigen discussed above, one was
determined to be identical to the 14 Kd alpha crystallin heat shock protein previously shown
to be present in M. tuberculosis, and a third was determined to be identical to the antigen
TbH-8 described above. The determined DNA sequences for the remaining five clones
(hereinafter referred to as TbH-29, TbH-30, TbH-32 and TbH-33) are provided in SEQ ID
NO: 133-136, respectively, with the corresponding predicted amino acid sequences being
provided in SEQ ID NO: 137-140, respectively. The DNA and amino acid sequences for
these antigens were compared with those in the gene bank as described above. No
homologies were found to the 5' end of TbH-29 (which contains the reactive open reading
frame), although the 3' end of TbH-29 was found to be identical to the M. tuberculosis
cosmid Y227. TbH-32 and TbH-33 were found to be identical to the previously identified
M. tuberculosis insertion element IS6110 and to the M. tuberculosis cosmid Y50,
respectively. No significant homologies to TbH-30 were found.

Positive phagemid from this additional screening were used to infect E. coli
XL-1 Blue MRF', as described in Sambrook et al., supra. Induction of recombinant protein
was accomplished by the addition of IPTG. Induced and uninduced lysates were run in
duplicate on SDS-PAGE and transferred to nitrocellulose filters. Filters were reacted with
human M. tuberculosis sera (1:200 dilution) reactive with TbH and a rabbit sera (1:200 or
1:250 dilution) reactive with the N-terminal 4 Kd portion of lacZ. Sera incubations were
performed for 2 hours at room temperature. Bound antibody was detected by addition of
125I-labeled Protein A and subsequent exposure to film for variable times ranging from 16 hours
to 11 days. The results of the immunoblots are summarized in Table 2.

TABLE 2

Antigen     Human M. tb Sera    Anti-lacZ Sera
TbH-29      45 Kd               45 Kd
TbH-30      No reactivity       29 Kd
TbH-32      12 Kd               12 Kd
TbH-33      16 Kd               16 Kd

Positive reaction of the recombinant human M. tuberculosis antigens with both
the human M. tuberculosis sera and anti-lacZ sera indicates that reactivity of the human M.
tuberculosis sera is directed towards the fusion protein. Antigens reactive with the anti-lacZ
sera but not with the human M. tuberculosis sera may be the result of the human M.
tuberculosis sera recognizing conformational epitopes, or the antigen-antibody binding
kinetics may be such that the 2 hour sera exposure in the immunoblot is not sufficient.

Studies were undertaken to determine whether the antigens TbH-9 and Tb38-1
represent cellular proteins or are secreted into M. tuberculosis culture media. In the first
study, rabbit sera were raised against A) secretory proteins of M. tuberculosis, B) the known
secretory recombinant M. tuberculosis antigen 85b, C) recombinant Tb38-1 and D)
recombinant TbH-9, using protocols substantially as described in Example 3A. Total M.
tuberculosis lysate, concentrated supernatant of M. tuberculosis cultures and the recombinant
antigens 85b, TbH-9 and Tb38-1 were resolved on denaturing gels, immobilized on
nitrocellulose membranes and duplicate blots were probed using the rabbit sera described
above.

The results of this analysis using control sera (panel I) and antisera (panel II)
against secretory proteins, recombinant 85b, recombinant Tb38-1 and recombinant TbH-9 are
shown in Figures 2A-D, respectively, wherein the lane designations are as follows: 1)
molecular weight protein standards; 2) 5 μg of M. tuberculosis lysate; 3) 5 μg secretory
proteins; 4) 50 ng recombinant Tb38-1; 5) 50 ng recombinant TbH-9; and 6) 50 ng
recombinant 85b. The recombinant antigens were engineered with six terminal histidine
residues and would therefore be expected to migrate with a mobility approximately 1 kD
larger than the native protein. In Figure 2D, recombinant TbH-9 is lacking approximately 10
kD of the full-length 42 kD antigen, hence the significant difference in the size of the
immunoreactive native TbH-9 antigen in the lysate lane (indicated by an arrow). These
results demonstrate that Tb38-1 and TbH-9 are intracellular antigens and are not actively
secreted by M. tuberculosis.
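
The expected size shift from the six terminal histidine residues can be checked with a short calculation. The sketch below assumes an average residue mass for histidine of 137.14 Da (the free amino acid less one water); it considers the six histidines only and ignores any linker residues a given expression construct may add. The names are illustrative.

```python
HIS_RESIDUE_DA = 137.14  # average mass of one histidine residue in a peptide chain


def his_tag_mass_kd(n_his=6):
    """Approximate mass, in kD, added by `n_his` consecutive histidine residues."""
    return n_his * HIS_RESIDUE_DA / 1000.0


print(round(his_tag_mass_kd(), 2))
```

This gives a little over 0.8 kD, consistent with the approximately 1 kD shift noted above.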

The finding that TbH-9 is an intracellular antigen was confirmed by 
determining the reactivity of TbH-9-specific human T cell clones to recombinant TbH-9, 
secretory M. tuberculosis proteins and PPD. A TbH-9-specific T cell clone (designated
131TbH-9) was generated from PBMC of a healthy PPD-positive donor. The proliferative
response of 131TbH-9 to secretory proteins, recombinant TbH-9 and a control M.
tuberculosis antigen, TbRa11, was determined by measuring uptake of tritiated thymidine, as
described in Example 1. As shown in Figure 3A, the clone 131TbH-9 responds specifically
to TbH-9, showing that TbH-9 is not a significant component of M. tuberculosis secretory
proteins. Figure 3B shows the production of IFN-γ by a second TbH-9-specific T cell clone
(designated PPD 800-10) prepared from PBMC from a healthy PPD-positive donor,
following stimulation of the T cell clone with secretory proteins, PPD or recombinant TbH-9.
These results further confirm that TbH-9 is not secreted by M. tuberculosis.

C. Use of Sera from Patients having Extrapulmonary Tuberculosis to Identify
DNA Sequences Encoding M. tuberculosis Antigens

Genomic DNA was isolated from M. tuberculosis Erdman strain, randomly
sheared and used to construct an expression library employing the Lambda ZAP expression
system (Stratagene, La Jolla, CA). The resulting library was screened using pools of sera
obtained from individuals with extrapulmonary tuberculosis, as described above in Example
3B, with the secondary antibody being goat anti-human IgG + A + M (H+L) conjugated with
alkaline phosphatase.

Eighteen clones were purified. Of these, 4 clones (hereinafter referred to as
XP14, XP24, XP31 and XP32) were found to bear some similarity to known sequences. The
determined DNA sequences for XP14, XP24 and XP31 are provided in SEQ ID NOS: 151-
153, respectively, with the 5' and 3' DNA sequences for XP32 being provided in SEQ ID
NOS: 154 and 155, respectively. The predicted amino acid sequence for XP14 is provided in
SEQ ID NO: 156. The reverse complement of XP14 was found to encode the amino acid
sequence provided in SEQ ID NO: 157.
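
Because either strand of a cloned insert may carry an open reading frame, both the determined sequence and its reverse complement are translated. The following sketch illustrates the operation using the standard genetic code; the short input sequence is a made-up example and is not taken from any SEQ ID NO of this disclosure, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
_CODON_TABLE = {
    b1 + b2 + b3: _AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(_BASES)
    for j, b2 in enumerate(_BASES)
    for k, b3 in enumerate(_BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate a DNA sequence in frame 1 (one-letter code, '*' for stop);
    trailing bases that do not form a complete codon are ignored."""
    seq = seq.upper()
    return "".join(_CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


forward = "ATGGCC"  # made-up example insert
print(translate(forward), translate(reverse_complement(forward)))
```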
Comparison of the sequences for the remaining 14 clones (hereinafter referred
to as XP1-XP6, XP17-XP19, XP22, XP25, XP27, XP30 and XP36) with those in the
genebank as described above, revealed no homologies with the exception of the 3' ends of
XP2 and XP6 which were found to bear some homology to known M. tuberculosis cosmids.
The DNA sequences for XP27 and XP36 are shown in SEQ ID NOS: 158 and 159,
respectively, with the 5' sequences for XP4, XP5, XP17 and XP30 being shown in SEQ ID
NOS: 160-163, respectively, and the 5' and 3' sequences for XP2, XP3, XP6, XP18, XP19,
XP22 and XP25 being shown in SEQ ID NOS: 164 and 165; 166 and 167; 168 and 169; 170
and 171; 172 and 173; 174 and 175; and 176 and 177, respectively. XP1 was found to
overlap with the DNA sequences for TbH4, disclosed above. The full-length DNA sequence
for TbH4-XP1 is provided in SEQ ID NO: 178. This DNA sequence was found to contain an
open reading frame encoding the amino acid sequence shown in SEQ ID NO: 179. The
reverse complement of TbH4-XP1 was found to contain an open reading frame encoding the
amino acid sequence shown in SEQ ID NO: 180. The DNA sequence for XP36 was found to
contain two open reading frames encoding the amino acid sequence shown in SEQ ID NOS:
181 and 182, with the reverse complement containing an open reading frame encoding the
amino acid sequence shown in SEQ ID NO: 183.

Recombinant XP1 protein was prepared as described above in Example 3B,
with a metal ion affinity chromatography column being employed for purification.
Recombinant XP1 was found to stimulate cell proliferation and IFN-γ production in T cells
isolated from M. tuberculosis-immune donors.

D. Preparation of M. tuberculosis Soluble Antigens using Rabbit Anti-sera
Raised against M. tuberculosis Fractionated Proteins

M. tuberculosis lysate was prepared as described above in Example 2. The
resulting material was fractionated by HPLC and the fractions screened by Western blot for
serological activity with a serum pool from M. tuberculosis-infected patients which showed
little or no immunoreactivity with other antigens of the present invention. Rabbit anti-sera
was generated against the most reactive fraction using the method described in Example 3A.
The anti-sera was used to screen an M. tuberculosis Erdman strain genomic DNA expression
library prepared as described above. Bacteriophage plaques expressing immunoreactive
antigens were purified. Phagemid from the plaques was rescued and the nucleotide sequences
of the M. tuberculosis clones determined.

Ten different clones were purified. Of these, one was found to be TbRa35,
described above, and one was found to be the previously identified M. tuberculosis antigen,
HSP60. Of the remaining eight clones, six (hereinafter referred to as RDIF2, RDIF5, RDIF8,
RDIF10, RDIF11 and RDIF12) were found to bear some similarity to previously identified
M. tuberculosis sequences. The determined DNA sequences for RDIF2, RDIF5, RDIF8,
RDIF10 and RDIF11 are provided in SEQ ID NOS: 184-188, respectively, with the
corresponding predicted amino acid sequences being provided in SEQ ID NOS: 189-193,
respectively. The 5' and 3' DNA sequences for RDIF12 are provided in SEQ ID NOS: 194
and 195, respectively. No significant homologies were found to the antigen RDIF7. The
determined DNA and predicted amino acid sequences for RDIF7 are provided in SEQ ID
NOS: 196 and 197, respectively. One additional clone, referred to as RDIF6, was isolated;
however, this was found to be identical to RDIF5.

Recombinant RDIF6, RDIF8, RDIF10 and RDIF11 were prepared as
described above. These antigens were found to stimulate cell proliferation and IFN-γ
production in T cells isolated from M. tuberculosis-immune donors.


EXAMPLE 4

Purification and Characterization of a Polypeptide from Tuberculin Purified
Protein Derivative

An M. tuberculosis polypeptide was isolated from tuberculin purified protein
derivative (PPD) as follows.

PPD was prepared as published with some modification (Seibert, F. et al.,
Tuberculin purified protein derivative. Preparation and analyses of a large quantity for
standard. The American Review of Tuberculosis 44:9-25, 1941). M. tuberculosis Rv strain
was grown for 6 weeks in synthetic medium in roller bottles at 37°C. Bottles containing the
bacterial growth were then heated to 100°C in water vapor for 3 hours. Cultures were sterile
filtered using a 0.22 μ filter and the liquid phase was concentrated 20 times using a 3 kD cut-
off membrane. Proteins were precipitated once with 50% ammonium sulfate solution and
eight times with 25% ammonium sulfate solution. The resulting proteins (PPD) were
fractionated by reverse phase liquid chromatography (RP-HPLC) using a C18 column (7.8 x
300 mm; Waters, Milford, MA) in a Biocad HPLC system (Perseptive Biosystems,
Framingham, MA). Fractions were eluted from the column with a linear gradient from 0-
100% buffer (0.1% TFA in acetonitrile). The flow rate was 10 ml/minute and eluent was
monitored at 214 nm and 280 nm.

Six fractions were collected, dried, suspended in PBS and tested individually
in M. tuberculosis-infected guinea pigs for induction of delayed type hypersensitivity (DTH)
reaction. One fraction was found to induce a strong DTH reaction and was subsequently
fractionated further by RP-HPLC on a microbore Vydac C18 column (Cat. No. 218TP5115)
in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions were eluted
with a linear gradient from 5-100% buffer (0.05% TFA in acetonitrile) with a flow rate of 80
µl/minute. Eluent was monitored at 215 nm. Eight fractions were collected and tested for
induction of DTH in M. tuberculosis-infected guinea pigs. One fraction was found to induce
strong DTH of about 16 mm induration. The other fractions did not induce detectable DTH.
The positive fraction was submitted to SDS-PAGE gel electrophoresis and found to contain a
single protein band of approximately 12 kD molecular weight.

This polypeptide, hereinafter referred to as DPPD, was sequenced from the
amino terminal using a Perkin Elmer/Applied Biosystems Division Procise 492 protein
sequencer as described above and found to have the N-terminal sequence shown in SEQ ID
NO: 124. Comparison of this sequence with known sequences in the gene bank as described
above revealed no known homologies. Four cyanogen bromide fragments of DPPD were
isolated and found to have the sequences shown in SEQ ID NOS: 125-128.

EXAMPLE 5
Synthesis of Synthetic Polypeptides

Polypeptides may be synthesized on a Millipore 9050 peptide synthesizer
using FMOC chemistry with HPTU (O-Benzotriazole-N,N,N',N'-tetramethyluronium
hexafluorophosphate) activation. A Gly-Cys-Gly sequence may be attached to the amino
terminus of the peptide to provide a method of conjugation or labeling of the peptide.
Cleavage of the peptides from the solid support may be carried out using the following
cleavage mixture: trifluoroacetic acid:ethanedithiol:thioanisole:water:phenol (40:1:2:2:3).
After cleaving for 2 hours, the peptides may be precipitated in cold methyl-t-butyl-ether. The
peptide pellets may then be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and
lyophilized prior to purification by C18 reverse phase HPLC. A gradient of 0-60%
acetonitrile (containing 0.1% TFA) in water (containing 0.1% TFA) may be used to elute the
peptides. Following lyophilization of the pure fractions, the peptides may be characterized
using electrospray mass spectrometry and by amino acid analysis.

This procedure was used to synthesize a TbM-1 peptide that contains one and 
a half repeats of a TbM-1 sequence. The TbM-1 peptide has the sequence 
GCGDRSGGNLDQIRLRRDRSGGNL (SEQ ID NO: 63). 
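
The composition of the TbM-1 peptide can be checked with a short sketch such as the following. The 14-residue repeat unit used here is an assumption inferred from the peptide sequence itself (it is not stated in the text), and the function and variable names are illustrative only.

```python
# Sketch: confirm that the TbM-1 peptide equals a Gly-Cys-Gly tag plus 1.5 repeats.
TBM1_PEPTIDE = "GCGDRSGGNLDQIRLRRDRSGGNL"  # SEQ ID NO: 63 as given in the text
LINKER = "GCG"                              # amino-terminal Gly-Cys-Gly sequence
REPEAT_UNIT = "DRSGGNLDQIRLRR"              # ASSUMED repeat unit, inferred from the peptide


def build_peptide(linker: str, unit: str, repeats: float) -> str:
    """Join the linker, the whole repeats, and the leading fraction of one more repeat."""
    whole = int(repeats)
    partial = round((repeats - whole) * len(unit))
    return linker + unit * whole + unit[:partial]


rebuilt = build_peptide(LINKER, REPEAT_UNIT, 1.5)
print(rebuilt, len(rebuilt), rebuilt == TBM1_PEPTIDE)
```

Under that assumption the rebuilt string matches the listed peptide residue for residue.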


EXAMPLE 6

Use of Representative Antigens for Serodiagnosis of Tuberculosis

This Example illustrates the diagnostic properties of several representative
antigens.

Assays were performed in 96-well plates that were coated with 200 ng antigen
diluted to 50 µL in carbonate coating buffer, pH 9.6. The wells were coated overnight at 4°C
(or 2 hours at 37°C). The plate contents were then removed and the wells were blocked for 2
hours with 200 µL of PBS/1% BSA. After the blocking step, the wells were washed five
times with PBS/0.1% Tween 20™. 50 µL sera, diluted 1:100 in PBS/0.1% Tween 20™/0.1%
BSA, was then added to each well and incubated for 30 minutes at room temperature. The
plates were then washed again five times with PBS/0.1% Tween 20™.

The enzyme conjugate (horseradish peroxidase-Protein A, Zymed, San
Francisco, CA) was then diluted 1:10,000 in PBS/0.1% Tween 20™/0.1% BSA, and 50 µL of
the diluted conjugate was added to each well and incubated for 30 minutes at room
temperature. Following incubation, the wells were washed five times with PBS/0.1% Tween
20™. 100 µL of tetramethylbenzidine peroxidase (TMB) substrate (Kirkegaard and Perry
Laboratories, Gaithersburg, MD) was added, undiluted, and incubated for about 15 minutes.
The reaction was stopped with the addition of 100 µL of 1 N H2SO4 to each well, and the
plates were read at 450 nm.
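
The quantities in the protocol above imply a coating concentration and per-plate reagent amounts that can be worked out as in the following sketch. It assumes that a 1:N dilution means one part stock in N parts final volume and ignores pipetting dead volume; the function names are illustrative.

```python
# Sketch: reagent arithmetic for one 96-well plate, using the figures quoted above.
WELLS = 96
ANTIGEN_NG_PER_WELL = 200     # ng antigen per well
COAT_UL_PER_WELL = 50         # µL coating volume per well
REAGENT_UL_PER_WELL = 50      # µL diluted serum or conjugate per well
SERUM_DILUTION = 100          # 1:100 (assumed to mean 1 part in 100 total)
CONJUGATE_DILUTION = 10_000   # 1:10,000 (same assumption)


def coating_conc_ug_per_ml(ng_per_well: float, ul_per_well: float) -> float:
    """ng/µL is numerically identical to µg/mL."""
    return ng_per_well / ul_per_well


def stock_volume_ul(final_ul: float, dilution: float) -> float:
    """Volume of undiluted stock needed to prepare final_ul at 1:dilution."""
    return final_ul / dilution


coat_conc = coating_conc_ug_per_ml(ANTIGEN_NG_PER_WELL, COAT_UL_PER_WELL)
antigen_ug_per_plate = WELLS * ANTIGEN_NG_PER_WELL / 1000
conjugate_stock_ul = stock_volume_ul(WELLS * REAGENT_UL_PER_WELL, CONJUGATE_DILUTION)
serum_ul_per_well = stock_volume_ul(REAGENT_UL_PER_WELL, SERUM_DILUTION)

print(f"coating concentration: {coat_conc} µg/mL")
print(f"antigen per plate: {antigen_ug_per_plate} µg")
print(f"conjugate stock per plate: {conjugate_stock_ul} µL")
print(f"neat serum per well: {serum_ul_per_well} µL")
```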

Figure 4 shows the ELISA reactivity of two recombinant antigens isolated
using method A in Example 3 (TbRa3 and TbRa9) with sera from M. tuberculosis positive
and negative patients. The reactivity of these antigens is compared to that of bacterial lysate
isolated from M. tuberculosis strain H37Ra (Difco, Detroit, MI). In both cases, the
recombinant antigens differentiated positive from negative sera. Based on cut-off values
obtained from receiver-operator curves, TbRa3 detected 56 out of 87 positive sera, and
TbRa9 detected 111 out of 165 positive sera.

Figure 5 illustrates the ELISA reactivity of representative antigens isolated
using method B of Example 3. The reactivity of the recombinant antigens TbH4, TbH12,
Tb38-1 and the peptide TbM-1 (as described in Example 4) is compared to that of the 38 kD
antigen described by Andersen and Hansen, Infect. Immun. 57:2481-2488, 1989. Again, all
of the polypeptides tested differentiated positive from negative sera. Based on cut-off values
obtained from receiver-operator curves, TbH4 detected 67 out of 126 positive sera, TbH12
detected 50 out of 125 positive sera, 38-1 detected 61 out of 101 positive sera and the TbM-1
peptide detected 25 out of 30 positive sera.

The reactivity of four antigens (TbRa3, TbRa9, TbH4 and TbH12) with sera
from a group of M. tuberculosis infected patients with differing reactivity in the acid fast stain
of sputum (Smithwick and David, Tubercle 52:226, 1971) was also examined, and compared
to the reactivity of M. tuberculosis lysate and the 38 kD antigen. The results are presented
in Table 3, below:

TABLE 3

Reactivity of Antigens with Sera from M. tuberculosis Patients

               Acid Fast                  ELISA Values
Patient        Sputum     Lysate  38kD    TbRa9   TbH12   TbH4    TbRa3
[illegible]    ++++       1.853   0.634   0.998   1.022   1.030   1.314
Tb01B93I-19    ++++       2.657   2.322   0.608   0.837   1.857   2.335
[illegible]    +++        2.703   0.527   0.492   0.281   0.501   2.002
Tb01B93I-10    +++        1.665   1.301   0.685   0.216   0.448   0.458
[illegible]    +++        2.817   0.697   0.509   0.301   0.173   2.608
Tb01B93I-15    +++        1.28    0.283   0.808   0.218   1.537   0.811
Tb01B93I-16    +++        2.908   >3      0.899   0.441   0.593   1.080
Tb01B93I-25    +++        0.395   0.131   0.335   0.211   0.107   0.948
Tb01B93I-87    +++        2.653   2.432   2.282   0.977   1.221   0.857
Tb01B93I-89    +++        1.912   2.370   2.436   0.876   0.520   0.952
Tb01B94I-108   +++        1.639   0.341   0.797   0.368   0.654   0.798
Tb01B94I-201   +++        1.721   0.419   0.661   0.137   0.064   0.692
Tb01B93I-88    ++         1.939   1.269   2.519   1.381   0.214   0.530
Tb01B93I-92    ++         2.355   2.329   2.78    0.685   0.997   2.527
Tb01B94I-109   ++         0.993   0.620   0.574   0.441   0.5     2.558
Tb01B94I-210   ++         2.777   >3      0.393   0.367   1.004   1.315
Tb01B94I-224   ++         2.913   0.476   0.251   1.297   1.990   0.256
Tb01B93I-9     +          2.649   0.278   0.210   0.140   0.181   1.586
Tb01B93I-14    +          >3      1.538   0.282   0.291   0.549   2.880
Tb01B93I-21    +          2.645   0.739   2.499   0.783   0.536   1.770
Tb01B93I-22               0.714   0.451   2.082   0.285   0.269   1.159
Tb01B93I-31    +          0.956   0.490   1.019   0.812   0.176   1.293
Tb01B93I-32               2.261   0.786   0.668   0.273   0.535   0.405
Tb01B93I-52               0.658   0.114   0.434   0.330   0.273   1.140
Tb01B93I-99               2.118   0.584   1.62    0.119   0.977   0.729
Tb01B94I-130              1.349   0.224   0.86    0.282   0.383   2.146
Tb01B94I-131              0.685   0.324   1.173   0.059   0.118   1.431
AT4-0070       Normal     0.072   0.043   0.092   0.071   0.040   0.039
AT4-0105       Normal     0.397   0.121   0.118   0.103   0.078   0.390
3/15/94-1      Normal     0.227   0.064   0.098   0.026   0.001   0.228
4/15/93-2      Normal     0.114   0.240   0.071   0.034   0.041   0.264
5/26/94-4      Normal     0.089   0.259   0.096   0.046   0.008   0.053
5/26/94-3      Normal     0.139   0.093   0.085   0.019   0.067   0.01


Based on cut-off values obtained from receiver-operator curves, TbRa3
detected 23 out of 27 positive sera, TbRa9 detected 22 out of 27, TbH4 detected 18 out of 27
and TbH12 detected 15 out of 27. If used in combination, these four antigens would have a
theoretical sensitivity of 27 out of 27, indicating that these antigens should complement each
other in the serological detection of M. tuberculosis infection. In addition, several of the
recombinant antigens detected positive sera that were not detected using the 38 kD antigen,
indicating that these antigens may be complementary to the 38 kD antigen.
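
The notion of a theoretical combined sensitivity can be illustrated as follows: a serum counts as detected by the combination if at least one antigen calls it positive. The per-serum calls below are hypothetical values chosen only to show the calculation; they are not taken from Table 3, for which the individual cut-off values are not given.

```python
# Sketch: combined detection across several antigens.
# HYPOTHETICAL calls (True = OD above that antigen's cut-off) for five positive sera.
calls = {
    "TbRa3": [True, True, False, True, False],
    "TbRa9": [True, False, True, False, False],
    "TbH4": [False, True, False, False, True],
    "TbH12": [False, False, True, False, False],
}


def combined_detection(calls_by_antigen):
    """Return (detected, total): sera called positive by at least one antigen."""
    per_serum = zip(*calls_by_antigen.values())
    flags = [any(row) for row in per_serum]
    return sum(flags), len(flags)


detected, total = combined_detection(calls)
for antigen, antigen_calls in calls.items():
    print(f"{antigen}: {sum(antigen_calls)}/{len(antigen_calls)}")
print(f"combined: {detected}/{total}")
```

In this toy panel no single antigen detects more than three of five sera, yet the combination detects all five, which is the sense in which the antigens complement each other.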

The reactivity of the recombinant antigen TbRa11 with sera from
M. tuberculosis patients shown to be negative for the 38 kD antigen, as well as with sera from
PPD positive and normal donors, was determined by ELISA as described above. The results
are shown in Figure 6, which indicates that TbRa11, while being negative with sera from PPD
positive and normal donors, detected sera that were negative with the 38 kD antigen. Of the
thirteen 38 kD negative sera tested, nine were positive with TbRa11, indicating that this
antigen may be reacting with a sub-group of 38 kD antigen negative sera. In contrast, in a
group of 38 kD positive sera where TbRa11 was reactive, the mean OD 450 for TbRa11 was
lower than that for the 38 kD antigen. The data indicate an inverse relationship between the
presence of TbRa11 activity and 38 kD positivity.

The antigen TbRa2A was tested in an indirect ELISA using initially 50 µl of
serum at 1:100 dilution for 30 minutes at room temperature followed by washing in PBS
Tween and incubating for 30 minutes with biotinylated Protein A (Zymed, San Francisco,
CA) at a 1:10,000 dilution. Following washing, 50 µl of streptavidin-horseradish peroxidase
(Zymed) at 1:10,000 dilution was added and the mixture incubated for 30 minutes. After
washing, the assay was developed with TMB substrate as described above. The reactivity of
TbRa2A with sera from M. tuberculosis patients and normal donors is shown in Table 4. The
mean value for reactivity of TbRa2A with sera from M. tuberculosis patients was 0.444 with
a standard deviation of 0.309. The mean for reactivity with sera from normal donors was
0.109 with a standard deviation of 0.029. Testing of 38 kD negative sera (Figure 7) also
indicated that the TbRa2A antigen was capable of detecting sera in this category.

TABLE 4

Reactivity of TbRa2A with Sera from M. tuberculosis Patients and from Normal
Donors

Serum ID    Status    OD 450
Tb85        TB        0.680
Tb86        TB        0.450
Tb87        TB        0.263
Tb88        TB        0.275
Tb89        TB        0.403
Tb91        TB        0.393
Tb92        TB        0.401
Tb93        TB        0.232
Tb94        TB        0.333
Tb95        TB        0.435
Tb96        TB        0.284
Tb97        TB        0.320
Tb99        TB        0.328
Tb100       TB        0.817
Tb101       TB        0.607
Tb102       TB        0.191
Tb103       TB        0.228
Tb107       TB        0.324
Tb109       TB        1.572
Tb112       TB        0.338
DL4-0176    Normal    0.036
AT4-0043    Normal    0.126
AT4-0044    Normal    0.130
AT4-0052    Normal    0.135
AT4-0053    Normal    0.133
AT4-0062    Normal    0.128
AT4-0070    Normal    0.088
AT4-0091    Normal    0.108
AT4-0100    Normal    0.106
AT4-0105    Normal    0.108
AT4-0109    Normal    0.105


The reactivity of the recombinant antigen (g) (SEQ ID NO: 60) with sera from
M. tuberculosis patients and normal donors was determined by ELISA as described above.
Figure 8 shows the results of the titration of antigen (g) with four M. tuberculosis positive
sera that were all reactive with the 38 kD antigen and with four donor sera. All four positive
sera were reactive with antigen (g).

The reactivity of the recombinant antigen TbH-29 (SEQ ID NO: 137) with
sera from M. tuberculosis patients, PPD positive donors and normal donors was determined
by indirect ELISA as described above. The results are shown in Figure 9. TbH-29 detected
30 out of 60 M. tuberculosis sera, 2 out of 8 PPD positive sera and 2 out of 27 normal sera.

Figure 10 shows the results of ELISA tests (both direct and indirect) of the
antigen TbH-33 (SEQ ID NO: 140) with sera from M. tuberculosis patients and from normal
donors and with a pool of sera from M. tuberculosis patients. The mean OD 450 was
demonstrated to be higher with sera from M. tuberculosis patients than from normal donors,
with the mean OD 450 being significantly higher in the indirect ELISA than in the direct
ELISA. Figure 11 is a titration curve for the reactivity of recombinant TbH-33 with sera
from M. tuberculosis patients and from normal donors showing an increase in OD 450 with
increasing concentration of antigen.

The reactivity of the recombinant antigens RDIF6, RDIF8 and RDIF10 (SEQ
ID NOS: 184-187, respectively) with sera from M. tuberculosis patients and normal donors
was determined by ELISA as described above. RDIF6 detected 6 out of 32 M. tuberculosis
sera and 0 out of 15 normal sera; RDIF8 detected 14 out of 32 M. tuberculosis sera and 0 out
of 15 normal sera; and RDIF10 detected 4 out of 27 M. tuberculosis sera and 1 out of 15
normal sera. In addition, RDIF10 was found to detect 0 out of 5 sera from PPD-positive
donors.
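
The detection counts above can be expressed as sensitivity and specificity, as in the following sketch. It treats the normal donor sera as the negative reference group, which is an assumption made for illustration; the text itself reports only the raw counts.

```python
# Sketch: sensitivity and specificity from the detection counts quoted above.
# (detected TB sera, TB sera tested, detected normal sera, normal sera tested)
counts = {
    "RDIF6": (6, 32, 0, 15),
    "RDIF8": (14, 32, 0, 15),
    "RDIF10": (4, 27, 1, 15),
}


def sensitivity(true_pos: int, n_pos: int) -> float:
    """Fraction of patient sera that were detected."""
    return true_pos / n_pos


def specificity(false_pos: int, n_neg: int) -> float:
    """Fraction of normal sera that were NOT detected."""
    return 1 - false_pos / n_neg


results = {
    name: (sensitivity(tp, n_pos), specificity(fp, n_neg))
    for name, (tp, n_pos, fp, n_neg) in counts.items()
}
for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```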

EXAMPLE 7

Preparation and Characterization of M. Tuberculosis Fusion Proteins

A fusion protein containing TbRa3, the 38 kD antigen and Tb38-1 was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 was modified by PCR
in order to facilitate their fusion and the subsequent expression of the fusion protein
TbRa3-38 kD-Tb38-1. TbRa3, 38 kD and Tb38-1 DNA was used to perform PCR using the primers
PDM-64 and PDM-65 (SEQ ID NO: 141 and 142), PDM-57 and PDM-58 (SEQ ID NO: 143
and 144), and PDM-69 and PDM-60 (SEQ ID NO: 145-146), respectively. In each case, the
DNA amplification was performed using 10 µl 10X Pfu buffer, 2 µl 10 mM dNTPs, 2 µl each
of the PCR primers at 10 µM concentration, 81.5 µl water, 1.5 µl Pfu DNA polymerase
(Stratagene, La Jolla, CA) and 1 µl DNA at either 70 ng/µl (for TbRa3) or 50 ng/µl (for 38
kD and Tb38-1). For TbRa3, denaturation at 94°C was performed for 2 min, followed by 40
cycles of 96°C for 15 sec and 72°C for 1 min, and lastly by 72°C for 4 min. For 38 kD,
denaturation at 96°C was performed for 2 min, followed by 40 cycles of 96°C for 30 sec,
68°C for 15 sec and 72°C for 3 min, and finally by 72°C for 4 min. For Tb38-1, denaturation
at 94°C for 2 min was followed by 10 cycles of 96°C for 15 sec, 68°C for 15 sec and 72°C for
1.5 min, 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for 1.5 min, and finally by 72°C
for 4 min.
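
The reaction volumes listed above can be tallied to confirm the total reaction volume and the resulting final concentrations. The following is a minimal sketch; the component labels "primer 1" and "primer 2" and the helper name are illustrative and do not appear in the original text.

```python
# Sketch: total volume and final concentrations for the PCR mix described above.
mix_ul = {
    "10X Pfu buffer": 10.0,
    "10 mM dNTPs": 2.0,
    "primer 1 (10 µM)": 2.0,   # label is illustrative
    "primer 2 (10 µM)": 2.0,   # label is illustrative
    "water": 81.5,
    "Pfu DNA polymerase": 1.5,
    "template DNA": 1.0,
}


def final_concentration(stock_conc: float, stock_ul: float, total_ul: float) -> float:
    """C1 * V1 = C2 * V2, solved for C2 (units follow the stock concentration)."""
    return stock_conc * stock_ul / total_ul


total_ul = sum(mix_ul.values())
buffer_x = final_concentration(10, mix_ul["10X Pfu buffer"], total_ul)      # in X
dntp_mm = final_concentration(10, mix_ul["10 mM dNTPs"], total_ul)          # in mM
primer_um = final_concentration(10, mix_ul["primer 1 (10 µM)"], total_ul)   # in µM
template_ng = {"TbRa3": 70 * mix_ul["template DNA"],
               "38 kD": 50 * mix_ul["template DNA"],
               "Tb38-1": 50 * mix_ul["template DNA"]}

print(f"total volume: {total_ul} µl")
print(f"buffer: {buffer_x}X, dNTPs: {dntp_mm} mM, each primer: {primer_um} µM")
print(f"template per reaction (ng): {template_ng}")
```

The listed volumes sum to a 100 µl reaction, which brings the 10X buffer to 1X.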

The TbRa3 PCR fragment was digested with NdeI and EcoRI and cloned
directly into pT7^L2 IL 1 vector using NdeI and EcoRI sites. The 38 kD PCR fragment was
digested with Sse8387I, treated with T4 DNA polymerase to make blunt ends and then
digested with EcoRI for direct cloning into the pT7^L2Ra3-1 vector which was digested with
StuI and EcoRI. The 38-1 PCR fragment was digested with Eco47III and EcoRI and directly
subcloned into pT7^L2Ra3/38kD-17 digested with the same enzymes. The whole fusion was
then transferred to pET28b using NdeI and EcoRI sites. The fusion construct was confirmed
by DNA sequencing.

The expression construct was transformed to BLR pLys S E. coli (Novagen,
Madison, WI) and grown overnight in LB broth with kanamycin (30 µg/ml) and
chloramphenicol (34 µg/ml). This culture (12 ml) was used to inoculate 500 ml 2XYT with
the same antibiotics and the culture was induced with IPTG at an OD560 of 0.44 to a final
concentration of 1.2 mM. Four hours post-induction, the bacteria were harvested and
sonicated in 20 mM Tris (8.0), 100 mM NaCl, 0.1% DOC, 20 µg/ml Leupeptin, 20 mM
PMSF followed by centrifugation at 26,000 X g. The resulting pellet was resuspended in 8 M
urea, 20 mM Tris (8.0), 100 mM NaCl and bound to Pro-bond nickel resin (Invitrogen,
Carlsbad, CA). The column was washed several times with the above buffer then eluted with
an imidazole gradient (50 mM, 100 mM, 500 mM imidazole was added to 8 M urea, 20 mM
Tris (8.0), 100 mM NaCl). The eluates containing the protein of interest were then dialyzed
against 10 mM Tris (8.0).

The DNA and amino acid sequences for the resulting fusion protein
(hereinafter referred to as TbRa3-38 kD-Tb38-1) are provided in SEQ ID NO: 147 and 148,
respectively.

A fusion protein containing the two antigens TbH9 and Tb38-1 (hereinafter
referred to as TbH9-Tb38-1) without a hinge sequence, was prepared using a similar
procedure to that described above. The DNA sequence for the TbH9-Tb38-1 fusion protein is
provided in SEQ ID NO: 151.

A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and DPEP was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 was modified by PCR
and cloned into vectors essentially as described above, with the primers PDM-69 (SEQ ID
NO: 145) and PDM-83 (SEQ ID NO: 200) being used for amplification of the Tb38-1A
fragment. Tb38-1A differs from Tb38-1 by a DraI site at the 3' end of the coding region that
keeps the final amino acid intact while creating a blunt restriction site that is in frame. The
TbRa3/38kD/Tb38-1A fusion was then transferred to pET28b using NdeI and EcoRI sites.

DPEP DNA was used to perform PCR using the primers PDM-84 and PDM-85
(SEQ ID NO: 201 and 202, respectively) and 1 µl DNA at 50 ng/µl. Denaturation at 94°C
was performed for 2 min, followed by 10 cycles of 96°C for 15 sec, 68°C for 15 sec and
72°C for 1.5 min; 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for 1.5 min; and
finally by 72°C for 4 min. The DPEP PCR fragment was digested with EcoRI and Eco72I
and cloned directly into the pET28Ra3/38kD/38-1A construct which was digested with DraI
and EcoRI. The fusion construct was confirmed to be correct by DNA sequencing.
Recombinant protein was prepared as described above. The DNA and amino acid sequences
for the resulting fusion protein (hereinafter referred to as TbF-2) are provided in SEQ ID NO:
203 and 204, respectively.

EXAMPLE 8
Use of M. Tuberculosis Fusion Proteins for
Serodiagnosis of Tuberculosis

The effectiveness of the fusion protein TbRa3-38 kD-Tb38-1, prepared as
described above, in the serodiagnosis of tuberculosis infection was examined by ELISA.

The ELISA protocol was as described above in Example 6, with the fusion
protein being coated at 200 ng/well. A panel of sera was chosen from a group of tuberculosis
patients previously shown, either by ELISA or by western blot analysis, to react with each of
the three antigens individually or in combination. Such a panel enabled the dissection of the
serological reactivity of the fusion protein to determine if all three epitopes functioned with
the fusion protein. As shown in Table 5, all four sera that reacted with TbRa3 only were
detectable with the fusion protein. Three sera that reacted only with Tb38-1 were also
detectable, as were two sera that reacted with 38 kD alone. The remaining 15 sera were all
positive with the fusion protein based on a cut-off in the assay of mean negatives +3 standard
deviations. These data demonstrate the functional activity of all three epitopes in the fusion
protein.
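
A cut-off of the mean of the negatives plus three standard deviations can be computed as in the sketch below. Which negative sera were used to set the cut-off is not stated; the sketch assumes, for illustration only, the fusion protein OD 450 values of the normal donors listed in Table 5 and the sample standard deviation.

```python
# Sketch: cut-off = mean(negatives) + k * SD(negatives), with k = 3.
from statistics import mean, stdev

# ASSUMPTION: normal-donor OD 450 values from Table 5 serve as the negative panel.
normal_od = [0.053, 0.087, 0.346, 0.064, 0.034, 0.037, 0.057, 0.054, 0.022, 0.147,
             0.101, 0.066, 0.054, 0.065, 0.041, 0.103, 0.212, 0.056, 0.051]


def cutoff(negatives, k: float = 3.0) -> float:
    """Mean of the negative sera plus k sample standard deviations."""
    return mean(negatives) + k * stdev(negatives)


def is_positive(od: float, threshold: float) -> bool:
    """A serum is called positive when its OD exceeds the cut-off."""
    return od > threshold


threshold = cutoff(normal_od)
lowest_tb_od = 0.392  # lowest OD 450 among the tuberculosis sera in Table 5
print(f"cut-off: {threshold:.3f}")
print(f"lowest TB serum positive: {is_positive(lowest_tb_od, threshold)}")
```

A different negative panel, or a population rather than sample standard deviation, would shift the threshold somewhat.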

Table 5

Reactivity of Tri-Peptide Fusion Protein with Sera from M. tuberculosis Patients

                      ELISA and/or Western    Fusion Recombinant
                      Blot Reactivity with
                      Individual Proteins
Serum ID     Status   38kD   Tb38-1   TbRa3   OD 450   Status
01B93I-40    TB                       +       0.413
01B93I-41    TB              +                0.392    +
01B93I-29    TB       +                       2.217    +
01B93I-109   TB              ±                0.522
01B93I-132   TB       +      +                0.937
5004         TB       ±               ±       1.098
15004        TB              +        +       2.077
39004        TB       +      +        +       1.675
68004        TB       +      +        +       2.388
99004        TB                       ±       0.607
107004       TB              +        ±       0.667
92004        TB              ±        ±       1.070
97004        TB                       ±       1.152    +
118004       TB                               2.694    +
173004       TB                       +       3.258    +
175004       TB       +               +       2.514
274004       TB                       +       3.220
276004       TB                               2.991
282004       TB                               0.824
289004       TB                               0.848    +
308004       TB       -      +                3.338    +
314004       TB                               1.362    +
317004       TB       +                       0.763    +
312004       TB              -        +       1.079    +
D176         PPD      -      -                0.145    -
D162         PPD      -      -                0.073    -
D161         PPD      -      -                0.097
D27          PPD                              0.082
A6-124       NORMAL   -      -                0.053
A6-125       NORMAL                           0.087
A6-126       NORMAL          -                0.346    ±
A6-127       NORMAL   -                       0.064
A6-128       NORMAL                           0.034
A6-129       NORMAL   -      -                0.037
A6-130       NORMAL                           0.057
A6-131       NORMAL                           0.054
A6-132       NORMAL   -      -                0.022    -
A6-133       NORMAL   -      -                0.147
A6-134       NORMAL   -      -                0.101
A6-135       NORMAL   -      -                0.066    -
A6-136       NORMAL                           0.054
A6-137       NORMAL                           0.065
A6-138       NORMAL                           0.041
A6-139       NORMAL                           0.103
A6-140       NORMAL                           0.212
A6-141       NORMAL                           0.056
A6-142       NORMAL                           0.051


The reactivity of the fusion protein TbF-2 with sera from M. tuberculosis-infected
patients was examined by ELISA using the protocol described above. The results of
these studies (Table 6) demonstrate that all four antigens function independently in the fusion
protein.

Table 6

Reactivity of TbF-2 Fusion Protein with TB and Normal Sera

                   TbF             TbF-2           ELISA Reactivity
Serum ID   Status  OD450   Status  OD450   Status  38 kD  TbRa3  Tb38-1  DPEP
B93I-40    TB      0.57    +       0.321   +       -             -       +
B93I-41    TB      0.601   +       0.396                                 -
B93I-109   TB      0.494           0.404           +      +      ±       -
B93I-132   TB      1.502   +       1.292   +              +              ±
5004       TB      1.806           1.666           ±      ±              -
15004      TB      2.862           2.468                  +              -
39004      TB      2.443   +       1.722   +       +      +              -
68004      TB      2.871           2.575   +              +              -
99004      TB      0.691   +       0.971           -      ±              -
107004     TB      0.875           0.732   +       -      ±              -
92004      TB      1.632           1.394           +      ±      ±       -
97004      TB      1.491   +       1.979           +      ±      -
118004     TB      3.182   +       3.045           +      ±      -       -
173004     TB      3.644   +       3.578           +      +
175004     TB      3.332   +       2.916           +      +      -       -
274004     TB      3.696   +       3.716           -      +      -
276004     TB      3.243           2.56    +       -      -      +       -
282004     TB      1.249   +       1.234   +              -      -       -
289004     TB      1.373   +       1.17    +       -      +      -       -
308004     TB      3.708   +       3.355           -      -      +       -
314004     TB      1.663   +       1.399           -      -              -
317004     TB      1.163           0.92            +      -              -
312004     TB      1.709           1.453   +       -             -       -
380004     TB      0.238   -       0.461           -      ±      -       +
451004     TB      0.18    -       0.2     -       -      -              ±
478004     TB      0.188   -       0.469   +
410004     TB      0.384           2.392   +       ±                     +
411004     TB      0.306           0.874                  +              +
421004     TB      0.357           1.456   +                             +
528004     TB      0.047           0.196                                 +
A6-87      Normal  0.094           0.063
A6-88      Normal  0.214           0.19
A6-89      Normal  0.248           0.125
A6-90      Normal  0.17[?]         0.206
A6-91      Normal  0.135           0.151
A6-92      Normal  0.064           0.097
A6-93      Normal  0.072           0.098
A6-94      Normal  0.072           0.064
A6-95      Normal  0.125           0.159
A6-96      Normal  0.121           0.12

Cut-off            0.284           0.266
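
The cut-off values at the foot of Table 6 can be applied to the OD450 readings to compare the two fusion proteins on individual sera. The sketch below does this for the last seven tuberculosis sera in the table, assuming that the first OD450 column belongs to TbF and the second to TbF-2, and that a reading above the cut-off is scored positive.

```python
# Sketch: classify sera with the Table 6 cut-offs (0.284 for TbF, 0.266 for TbF-2).
CUTOFF = {"TbF": 0.284, "TbF-2": 0.266}

# serum ID -> (TbF OD450, TbF-2 OD450), last seven TB sera of Table 6
sera = {
    "380004": (0.238, 0.461),
    "451004": (0.18, 0.2),
    "478004": (0.188, 0.469),
    "410004": (0.384, 2.392),
    "411004": (0.306, 0.874),
    "421004": (0.357, 1.456),
    "528004": (0.047, 0.196),
}


def positives(readings, column: int, threshold: float):
    """Serum IDs whose OD450 in the given column exceeds the cut-off."""
    return [sid for sid, ods in readings.items() if ods[column] > threshold]


tbf_pos = positives(sera, 0, CUTOFF["TbF"])
tbf2_pos = positives(sera, 1, CUTOFF["TbF-2"])
gained = [sid for sid in tbf2_pos if sid not in tbf_pos]

print(f"TbF positive:   {tbf_pos}")
print(f"TbF-2 positive: {tbf2_pos}")
print(f"detected by TbF-2 only: {gained}")
```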

One of skill in the art will appreciate that the order of the individual antigens
within the fusion protein may be changed and that comparable activity would be expected 
provided each of the epitopes is still functionally available. In addition, truncated forms of 
the proteins containing active epitopes may be used in the construction of fusion proteins. 

From the foregoing, it will be appreciated that, although specific embodiments 
of the invention have been described herein for the purpose of illustration, various 
modifications may be made without deviating from the spirit and scope of the invention. 

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANTS: Reed, Steven G. 

Skeiky, Yasir A.W. 
Dillon, Davin C. 
Campos-Neto, Antonia 
Houghton, Raymond 
Vedvick, Thomas S. 
Twardzik, Daniel R. 
Lodes, Michael J. 

(ii) TITLE OF INVENTION: COMPOUNDS AND METHODS FOR DIAGNOSIS OF 

TUBERCULOSIS 

(iii) NUMBER OF SEQUENCES: 209 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SEED and BERRY LLP 

(B) STREET: 6300 Columbia Center, 701 Fifth Avenue 

(C) CITY: Seattle 

(D) STATE: Washington 

(E) COUNTRY: USA 

(F) ZIP: 98104-7092 

(V) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 01-OCT-1997

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Maki, David J. 

(B) REGISTRATION NUMBER: 31,392 

(C) REFERENCE/DOCKET NUMBER: 210121. 417C7 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: (206) 622-4900 
(B) TELEFAX: (206) 682-6031



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 766 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CGAGGCACCG GTAGTTTGAA CCAAACGCAC AATCGACGGG CAAACGAACG GAAGAACACA 60 

ACCATGAAGA TGGTGAAATC GATCGCCGCA GGTCTGACCG CCGCGGCTGC AATCGGCGCC 120 

GCTGCGGCCG GTGTGACTTC GATCATGGCT GGCGGCCCGG TCGTATACCA GATGCAGCCG 180 

GTCGTCTTCG GCGCGCCACT GCCGTTGGAC CCGGCATCCG CCCCTGACGT CCCGACCGCC 240 

GCCCAGTTGA CCAGCCTGCT CAACAGCCTC GCCGATCCCA ACGTGTCGTT TGCGAACAAG 300 

GGCAGTCTGG TCGAGGGCGG CATCGGGGGC ACCGAGGCGC GCATCGCCGA CCACAAGCTG 360

AAGAAGGCCG CCGAGCACGG GGATCTGCCG CTGTCGTTCA GCGTGACGAA CATCCAGCCG 420

GCGGCCGCCG GTTCGGCCAC CGCCGACGTT TCCGTCTCGG GTCCGAAGCT CTCGTCGCCG 480

GTCACGCAGA ACGTCACGTT CGTGAATCAA GGCGGCTGGA TGCTGTCACG CGCATCGGCG 540

ATGGAGTTGC TGCAGGCCGC AGGGNAACTG ATTGGCGGGC CGGNTTCAGC CCGCTGTTCA 600

GCTACGCCGC CCGCCTGGTG ACGCGTCCAT GTCGAACACT CGCGCGTGTA GCACGGTGCG 660

GTNTGCGCAG GGNCGCACGC ACCGCCCGGT GCAAGCCGTC CTCGAGATAG GTGGTGNCTC 720

GNCACCAGNG ANCACCCCCN NNTCGNCNNT TCTCGNTGMT GNATGA 766

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 752 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

ATGCATCACC ATCACCATCA CGATGAAGTC ACGGTAGAGA CGACCTi:CGT CTTCCGCGCA 60

GACTTCCTCA GCGAGCTGGA CGCTCCTGCG CAAGCGGGTA CGGAGAGCGC GGTCTCCGGG 120 

GTGGA.^\GGGC TCCCGCCGGG CTCGGCGTTG CTGGTAGTCA AACGAGGCCC CAACGCCGGG 180 

TCCCGGTTCC TACTCGACCA AGCCATCACG TCGGCTGGTC GGCATCCCGA CAGCGACATA 240

TTTCTCGACG ACGTGACCGT GAGCCGTCGC CATGCTGAAT TCCGGTTGGA AAACAACGAA 300

TTCAATGTCG TCGATGTCGG GAGTCTCAAC GGCACCTACG TCAACCGCGA GCCCGTGGAT 360

TCGGCGGTGC TGGCGAACGG CGACGAGGTC CAGATCGGCA AGCTCCGGTT GGTGTTCTTG 420

ACCGGACCCA AGCAAGGCGA GGATGACGGG AGTACCGGGG GCCCGTGAGC GCACCCGATA 480

GCCCCGCGCT GGCCGGGATG TCGATCGGGG CGGTCCTCCG ACCTGCTACG ACCGGATTTT 540

CCCTGATGTC CACCATCTCC AAGATTCGAT TCTTGGGAGG CTTGAGGGTC NGGGTGACCC 600

CCCCGCGGGC CTCATTCNGG GGTNTCGGCN GGTTTCACCC CNTACCNACT GCCNCCCGGN 660

TTGCNAATTC NTTCTTCNCT GCCCNNAAAG GGACCNTTAN CTTGCCGCTN GAAANGGTNA 720

TCCNGGGCCC NTCCTNGAAN CCCCNTCCCC CT 752
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 813 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CATATGCATC ACCATCACCA TCACACTTCT AACCGCCCAG CGCGTCGGGG GCGTCGAGCA 60 

CCACGCGACA CCGGGCCCGA TCGATCTGCT AGCTTGAGTC TGGTCAGGCA TCGTCGTCAG 120

CAGCGCGATG CCCTATGTTT GTCGTCGACT CAGATATCGC GGCAATCCAA TCTCCCGCCT 180

GCGGCCGGCG GTGCTGCAAA CTACTCCCGG AGGAATTTCG ACGTGCGCAT CAAGATCTTC 240

ATGCTGGTCA CGGCTGTCGT TTTGCTCTGT TGTTCGGGTG TGGCCACGGC CGCGCCCAAG 300 

ACCTACTGCG AGGAGTTGAA AGGCACCGAT ACCGGCCAGG CGTGCCAGAT TCAAATGTCC 360 

GACCCGGCCT ACAACATCAA CATCAGCCTG CCCAGTTACT ACCCCGACCA GAAGTCGCTG 420 

G/\AAATTACA TCGCCCA.:-AC Gi:GCGACAAG TTCCTCAGCG CGGCCACATC GTCCACTCCA 480

'.:GCGAAGCCC CCTACGf-J^/vr GAATATCACC TCGGCCACAT ACCAGTCCGC GATACCGCCG 540

CGTGGTACciC AGGCCGT ZGT GCTCAMGGTC TACCACAACC; CCGGCGG'.:AC GCACCCAACG 600

ACCACGTAOA AGGCCTTCGA TTGGGACCAG GCCTATCGCA AGCCAATCAC CTATGACACG 660

CTGTGGCAGG CTGACACCGA TCC:;CTGCCA GTCGTCTTCC CCATTGTTCC AAGGTGAACT 720

GAGCAACGCA GACCGGGACA ACWGGTATCG ATAGCCGCCN AATGCCGGCT TGGAACCCNG 780
TGAAATTATC ACAACTTCGC AGTCACNAAA NAA 813
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 447 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

CGGTATGAAC ACGGCCGCGT CCGATAACTT CCAGCTGTCC CAGGGTGGGC AGGGATTCGC 60 

CATTCCGATC GGGCAGGCGA TGGCGATCGC GGGCCAGATC CGATCGGGTG GGGGGTCACC 120

CACCGTTCAT ATCGGGCCTA CCGCCTTCCT CGGCTTGGGT GTTGTCGACA ACAACGGCAA 180

CGGCGCACGA GTCCAACGCG TGGTCGGGAG CGCTCCGGCG GCAAGTCTCG GCATCTCCAC 240

CGGCGACGTG ATCACCGCGG TCGACGGCGC TCCGATCAAC TCGGCCACCG CGATGGCGGA 300 

CGCGCTTAAC GGGCATCATC CCGGTGACGT CATCTCGGTG AACTGGCAAA CCAAGTCGGG 360 

CGGCACGCGT ACAGGGAACG TGACATTGGC CGAGGGACCC CCGGCCTGAT TTCGTCGYGG 420

ATACCACCCG CCGGCCGGCC AATTGGA 447
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 604 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTCCCACTGC GGTCGCCGAG TATGTCGCCC AGCAAATGTC TGGCAGCCGC CCAACGGAAT 60 

CCGGTGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120 

AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180

CCGGCGACGG NGAGCGCCGG AATGGCGCGA GTGAGGAGGT GGNCAGTCAT GCCCAGNGTG 240

ATCCAATCAA CGTGNATTCG GNCTGNGGGN CCATTTGAGA ATCGAGGTAG TGAGCGCAAA 300 

TGAATGATGG AAAACGGGNG GNGACGTCCG NTGTTCTGGT GGTGNTAGGT GNCTGNCTGG 360

NGTNGNGGNT ATCAGGATGT TCTTCGNCGA AANCTGATGN CGAGGAACAG GGTGTNCCCG 420

NNANNCCNAN GGNGTCCNAN CCCNNNNTGC TCGNCGANAT CANANAGNCG NTTGATGNGA 480

NAAAAGGGH.^ GANCAGNNNN AANTNGNGGN CCNAANAANC NNNANNGNNG NNAGNTNGNT 540

NNNTNTTNNC ANNNNNNNTG NNGNNGNNCN NNNCAANCNN NTNNNNGNAA NNGGNTTNTT 600
NAAT 604

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 633 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

TTGCANGTCG AACCACCTCA CTAAAGGGAA CAAAAGCTNG AGCTCCACCG CGGTGGCGGC 60 

CGCTCTAGAA CTAGTGKATM YYYCKGGCTG CAGSAATYCG GYACGAGCAT TAGGACAGTC 120 

TAACGGTCCT GTTACGGTGA TCGAATGACC GACGACATCC TGCTGATCGA CACCGACGAA 180

CGGGTGCGAA CCCTCACCCT CAACCGGCCG CAGTCCCGYA ACGCGCTCTC GGCGGCGCTA 240

CGGGATCGGT TTTTCGCGGY GTTGGYCGAC GCCGAGGYCG ACGACGACAT CGACGTCGTC 300 

ATCCTCACCG GYGCCGATCC GGTGTTCTGC '^CCGGACTC^G ACCTCAAGGT AGCTGGCCGG 360

GCAGACCGCG CTGCCGGACA TCTCACCGCG GTGGGCGGCC ATGACCAAGC CGGTGATCGG 420

CGCGATCAAC (.^GCGCCi^CGG TCACCGGCGG GCTCGAACTG GCGCTGTACT GCGACATCCT 480

GATCGCCTCC GAGCAC':^CCC GCTTCGNCGA CACCCACGC-: CGGGTGGGGC TGCTGCCCAC 540

CTGGGGA'^TC AGTGTGTGCT TGCCGCAA^^A GGTCGGCATC GGNCTGGGCC GGTGGATGAG 600

CCTGACC';G'': GIACTACCTGT CCGTGACCGA CGC 633 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1362 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CGACGACGAC GGCGCCGGAG AGCGGGCGCG AACGGCGATC GACGCGGCCC TGGCCAGAGT 60 

CGGCACCACC CAGGAGGGAG TCGAATCATG AAATTTGTCA ACCATATTGA GCCCGTCGCG 120

CCCCGCCGAG CCGGCGGCGC GGTCGCCGAG GTCTATGCCG AGGCCCGCCG CGAGTTCGGC 180

CGGCTGCCCG AGCCGCTCGC CATGCTGTCC CCGGACGAGG GACTGCTCAC CGCCGGCTGG 240

GCGACGTTGC GCGAGACACT GCTGGTGGGC CAGGTGCCGC GTGGCCGCAA GGAAGCCGTC 300

GCCGCCGCCG TCGCGGCCAG CCTGCGCTGC CCCTGGTGCG TCGACGCACA CACCACCATG 360 

CTGTACGCGG CAGGCCAAAC CGACACCGCC GCGGCGATCT TGGCCGGCAC AGCACCTGCC 420 

GCCGGTGACC CGAACGCGCC GTATGTGGCG TGGGCGGCAG GAACCGGGAC ACCGGCGGGA 480

CCGCCGGCAC CGTTCGGCCC GGATGTCGCC GCCGAATACC TGGGCACCGC GGTGCAATTC 540

CACTTCATCG CACGCCTGGT CCTGGTGCTG CTGGACGAAA CCTTCCTGCC GGGGGGCCCG 600 

CGCGCCCAAC AGCTCATGCG CCGCGCCGGT GGACTGGTGT TCGCCCGCAA GGTGCGCGCG 660 

GAGCATCGGC CGGGCCGCTC CACCCGCCGG CTCGAGCCGC GAACGCTGCC CGACGATCTG 720

GCATGGGCAA CACCGTCCGA GCCCATAGCA ACCGCGTTCG CCGCGCTCAG CCACCACCTG 780

GACACCGOGO CGCACCTGCC GCCACCGACT CGTCACGTGG TCAGGCGGGT CGTGGGGTCG 840

TGGCACGGCG AGCCAATGCC GATGAGCAGT Ci^CTGGACGA ACGAGCACAC ':Gf:CGAGCTG 900

CCCGCCGACC TGCACGCGCC CACCCGTCTT GCCCTGCTGA CCGGCCTGGC CCCGCATCAG 960 

GTGACCGACG ACGACGTCGC CGCGGCCCGA TCCCTGCTGG ACACCGATGC (^GCGCTGGTT 1020 

GGCGCCCTG13 CCTGGGCCGC CTTCACCGCC GCGCGGCG ::A TCGGCAOGTG GATCGGCGCC 1080

GCCGCCGAGG GCCAGGTGTC i^^CGGCAAAAC CCGACTGGGT GAGTGTGCGC GCCCTGTCGG 1140

TAGGGTGTCA TCGCTGGCGC GAGGi:^,ATC'PC G'^i'^GGGGCGA ACGGAGGTGG GGACACAGGT 1200

ggaagct:.gg :c:actgggt tgcgcccc.aa cgcC':.tggtg ggcgtti:ggt tggccgcact 1260

GGCCGATGAG GT2GGCGCGG I'iCCCTTGCGC GAAGGTCCAG CTCAACGTGC CGTCACCGAA 1320

GGACCGGACG GTCACCGGGG GTCACCCTGC GCGCCCAAGG AA 1362
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1458 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GCGACGACCC CGATATGCCG GGCACCGTAG CGAAAGCCGT CGCCGACGCA CTCGGGCGCG 60 

GTATCGCTCC CGTTGAGGAC ATTCAGGACT GCGTGGAGGC CCGGCTGGGG GAAGCCGGTC 120 

TGGATGACGT GGCCCGTGTT TACATCATCT ACCGGCAGCG GCGCGCCGAG CTGCGGACGG 180 

CTAAGGCCTT GCTCGGCGTG CGGGACGAGT TAAAGCTGAG CTTGGCGGCC GTGACGGTAC 240 

TGCGCGAGCG CTATCTGCTG CACGACGAGC AGGGCCGGCC GGCCGAGTCG ACCGGCGAGC 300 

TGATGGACCG ATCGGCGCGC TGTGTCGCGG CGGCCGAGGA CCAGTATGAG CCGGGCTCGT 360

CGAGGCGGTG GGCCGAGCGG TTCGCCACGC TATTACGCAA CCTGGAATTC CTGCCGAATT 420

CGCCCACGTT GATGAACTCT GGCACCGACC TGGGACTGCT CGCCGGCTGT TTTGTTCTGC 480

CGATTGAGGA TTCGCTGCAA TCGATCTTTG CGACGCTGGG ACAGGCCGCC GAGCTGCAGC 540

GGGCTGGAGG CGGCACCGGA TATGCGTTCA GCCACCTGCG ACCCGCCGGG GATCGGGTGG 600

CCTCCACGGG CGGCACGGCC AGCGGACCGG TGTCGTTTCT ACGGCTGTAT GACAGTGCCG 660

CGGGTGTGGT CTCCATGGGC GGTCGCCGGC GTGGCGCCTG TATGGCTGTG CTTGATGTGT 720

CGCACCCGGA TATCTGTGAT TTCGTCACCG CCAAGGCCGA ATCCCCCAGC GAGCTCCCGC 780

ATTTCAACCT ATCGGTTGGT GTGACCGACG CGTTCCTGCG GGCCGTCGAA CGCAACGGCC 840

TACACCGGCT GGTCA/^TCCG CGAACCGGCA AGATCGTCGC GCGGATGCCC GCCGCCGAGC 900 

TGTTCGACGC CATCTGCAAA GCCGCGCACG CCGGTGGC':;A TCCCGGGCTG GTGTTTCTCG 960

ACACGATCAA TAGGGC7V.\A: CCGGTGCCGG i^GAGAGGCCG CATCGAGGCG ACCAACCCGT 1020 

GCGGGGA:,GT CCCAC';'i:^CT'^. CCTTACGA^T CATGT.^TCT CGGCTi:GATC AACCTCGCCC 1080

GGATGCTCGC CGACGGTCir.C GTCGACTGGG ACCGGCTCGA GGAGGTCGCC GGTGTGGCGG 1140

TGCGGTTCCT TGATGACGTC ATCGATGT:A GCCGCTACCC CTTCCCCGAA CTGGGTGAGG 1200

CGGCCCGCGC CACCCGCAAG ATCGGGCTGG GAGTCATGGG TTTGGCGGAA CTGCTTGCCG 1260

CACTGGGTAT TCCGTACGAC AGTGAAGAAG CCGTGCGGTT AGCCACCCGG CTCATGCGTC 1320

GCATACAGCA GGCGGCGCAC ACGGCATCGC GGAGGCTGGC CGAAGAGCGG GGCGCATTCC 1380

CGGCGTTCAC CGATAGCCGG TTCGCGCGGT CGGGCCCGAG GCGCAACGCA CAGGTCACCT 1440

CCGTCGCTCC GACGGGCA 1458
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 862 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
ACGGTGTAAT CGTGCTGGAT CTGGAACCGC GTGGCCCGCT ACCTACCGAG ATCTACTGGC 60 

GGCGCAGGGG GCTGGCCCTG GGCATCGCGG TCGTCGTAGT CGGGATCGCG GTGGCCATCG 120 

TCATCGCCTT CGTCGACAGC AGCGCCGGTG CCAAACCGGT CAGCGCCGAC AAGCCGGCCT 180 

CCGCCCAGAG CCATCCGGGC TCGCCGGCAC CCCAAGCACC CCAGCCGGCC GGGCAAACCG 240

AAGGTAACGC CGCCGCGGCC CCGCCGCAGG GCCAATVACCC CGAGACACCC ACGCCCACCG 300 

CCGCGGTGCA GCCGCCGCCG GTGCTCAAGG AAGGGGACGA TTGCCCCGAT TCGACGCTGG 360 

CCGTCAAAGG TTTGA'T-CAA^ GCGCCGCAGT ACTACGTCGG CGACCAGCCG aiAGTTCACCA 420

TGGTGGTCAC CAACATCGGT CTGGTGTCCT GTAAACGC3A CGTTGGGGCC GCGGTGTTGG 480

CCGCCTACGT TTACTCGCTG GACAACAAGC GGTTGTGGTC CAACCTGGAC TGCGCGCCCT 540

CGAATGAGAC GCTGGTCAAG ACGTTTTCCC CCGGTGAGCA GGTAACGACC GCGGTGACCT 600 

GGACCGGGAT GGGATCGGC-:^ CCGCGCTGCC CATTGCCGCG GCCGGCGATC GGGCCGGGCA 660

CCTACAATCT CGTGGTACAA CTGGGCAATC TGCGCTCGCT GCCGGTTCCG TTCATCCTGA 720

ATCAGCCGCC GCCGC':3CC: GGGCCGGTAC CCGCTCCGGG TCCAGCGCAG GCGCi::TCCGC 780

CGGAGTCTCC :GCGCAAGGC GGATAATTAT T3ATCGCTGA T'^GTCGATTC CGCCAGCTGT 840

GACAACCCCT CGCCTCGTGC CG 862
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 622 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC CAATGACAAA 60 

GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC GAACGCTGGA 120 

GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG CGCGGACGCG 180 

TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC CTTTCAGGAT 240

CCCTCGGGCG GTAGCCGCAC AGTCCAAGTC ACCCTCGGCA AGGCGGAGCA GTGATGAAGG 300 

TCGCCGCGCA GTGTTCAAAG CTCGGATATA CGGTGGCACC CATGGAACAG CGTGCGGAGT 360 

TGGTGGTTGG CCGGGCACTT GTCGTCGTCG TTGACGATCG CACGGCGCAC GGCGATGAAG 420

ACCACAGCGG GCCGCTTGTC ACCGAGCTGC TCACCGAGGC CGGGTTTGTT GTCGACGGCG 480

TGGTGGCGGT GTCGGCCGAC GAGGTCGAGA TCCGAAATGC GCTGAACACA GCGGTGATCG 540

GCGGGGTGGA CCTGGTGGTG TCGGTCGGCG GGACCGGNGT GACGNCTCGC GATGTCACCC 600 

CGGAAGCCAC CCGNGACATT CT 622 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1200 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GGCGCAGCGG TAAi^CCTGTT GGCCGCCGGC ACACTCGTGT TGACAGCATG CGGCGGTGGC 60 

ACCAACAGCT CGTCGTCAGG CGCAGGCGGA ACGTCTGGGT CGGTGCACTG CGGCGGCAAG 120 

AAGGAGCTCC ACTCCAGCGG CTCGACCGCA CAAGAAAATG CCAT:^,GAGCA GTTCCTCTAT 180

GCCTACGTGC GATCGTGCCC GGGCTACACG TTGGACTACA ACGCCAACGG GTCCGGTGCC 240

GGGGTGACCC AGTTTCTCAA CAACGAAACC GATTTCGCCG GCTCGGATGT CCCGTTGAAT 300 

CCGTCGACCG GTCAACCTGA CCGGTCGGCG GAGCGGTGCG GTTCCCCGGC ATGGGACCTG 360

CCGACGGTGT TCGGCCCGAT CGCGATCACC TACAATATCA AGGGCGTGAG CACGCTGAAT 420

CTTGACGGAC CCACTACCGC CAAGATTTTC AACGGCACCA TCACCGTGTG GAATGATCCA 480

CAGATCCAAG CCCTCAACTC CGGCACCGAC CTGCCGCCAA CACCGATTAG CGTTATCTTC 540

CGCAGCGACA AGTCCGGTAC GTCGGACAAC TTCCAGAAAT ACCTCGACGG TGTATCCAAC 600

GGGGCGTGGG GCAAAGGCGC CAGCGAAACG TTCAGCGGGG GCGTCGGCGT CGGCGCCAGC 660

GGGAACAACG GAACGTCGGC CCTACTGCAG ACGACCGACG GGTCGATCAC CTACAACGAG 720

TGGTCGTTTG CGGTGGGTAA GCAGTTGAAC ATGGCCCAGA TCATCACGTC GGCGGGTCCG 780

GATCCAGTGG CGATCACCAC CGAGTCGGTC GGTAAGACAA TCGCCGGGGC CAAGATCATG 840

GGACAAGGCA ACGACCTGGT ATTGGACACG TCGTCGTTCT ACAGACCCAC CCAGCCTGGC 900 

TCTTACCCGA TCGTGCTGGC GACCTATGAG ATCGTCTGCT CGAAATACCC GGATGCGACG 960 

ACCGGTACTG CGGTAAGGGC GTTTATGCAA GCCGCGATTG GTCCAGGCCA AGAAGGCCTG 1020 

GACCAATACG GCTCCATTCC GTTGCCCAAA TCGTTCCAAG CAAAATTGGC GGCCGCGGTG 1080

AATGCTATTT CTTGACCTAG TGAAGGGAAT TCGACGGTGA GCGATGCCGT TCCGCAGGTA 1140

GGGTCGCAAT TTGGGCCGTA TCAGCTATTG CGGCTGCTGG GCCGAGGCGG GATGGGCGAG 1200 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1155 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GCAAGCAGCT GCAGGTCGTG :;:tgttcga^:(:^ AACTGGGCAT GCCGAAGACC AAACGCACCA 60 

AGACCGGCTA CACCACGGAT GCCGACGCGC TGCAGTCGTT GTTCGACAAG ACCGGGCATC 120 

CGTTTCTGCA ACATCTGCTC GCCCACCGCC ACGTCACCCG GCTCAAGGTC ACCGTCGACG 180

GGTTGCTCCA AGCGGTGGCC GCCGACGGCC GCATCCACAC CACGTTCAAC CAGACGATCG 240

CCGCGACCGG CCGGCTCTCC TCGACCGAAC CCAACCTGCA GAACATCCCG ATCCGCACCG 300 

ACGCGGGCCG GCGGATCCGG GACGCGTTCG TGGTCGGGGA CGGTTACGCC GAGTTGATGA 360 

CGGCCGACTA CAGCCAGATC GAGATGCGGA TCATGGGGCA CCTGTCCGGG GACGAGGGCC 420 

TCATCGAGGC GTTCAACACC GGGGAGGACC TGTATTCGTT CGTCGCGTCC CGGGTGTTCG 480

GTGTGCCCAT CGACGAGGTC ACCGGCGAGT TGCGGCGCCG GGTCAAGGCG ATGTCCTACG 540

GGCTGGTTTA CGGGTTGAGC GCCTACGGCC TGTCGCAGCA GTTGAAAATC TCCACCGAGG 600 

AAGCCAACGA GCAGATGGAC GCGTATTTCG CCCGATTCGG CGGGGTGCGC GACTACCTGC 660 

GCGCCGTAGT CGAGCGGGCC CGCAAGGACG GCTACACCTC GACGGTGCTG GGCCGTCGCC 720 

GCTACCTGCC CGAGCTGGAC AGCAGCAACC GTCAAGTGCG GGAGGCCGCC GAGCGGGCGG 780 

CGCTGAACGC GCCGATCCAG GGCAGCGCGG CCGACATCAT CAAGGTGGCC ATGATCCAGG 840

TCGACAAGGC GCTCAACGAG GCACAGCTGG CGTCGCGCAT GCTGCTGCAG GTCCACGACG 900

AGCTGCTGTT CGAAATCGCC CCCGGTGAAC GCGAGCGGGT CGAGGCCCTG GTGCGCGACA 960 

AGATGGGCGG CGCTTACCCG CTCGACGTCC CGCTGGAGGT GTCGGTGGGC TACGGCCGCA 1020 

GCTGGGACGC GGCGGCGCAC TGAGTGCCGA GCGTGCATCT GGGGCGGGAA TTCGGCGATT 1080 

TTTCCGCCCT GAGTTCACGC TCGGCGCAAT CGGGACCGAG TTTGTCCAGC GTGTACCCGT 1140 

CGAGTAGCCT CGTCA 1155

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1771 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GAGCGCCGTC TGGTCTTTGA ACGGTTTTAC CGGTCGGCAT CGGCACGGGC GTTGCCGGGT 60 

TCr^r.GCCTCG GGTTGGCGAT CGTCAAACAG GTGGTGCTCA ACCACGGCGG ATTGCTGCGC 120 

ATCGAAGACA CCGACCCAGG CGGCCAGCCC CCTGGAACGT CGATTTACGT GCTGCTCCCC 180

GGCCGTCGGA TGCCGATTCC GCAGCTTCCC GGTGCGACGG CTGGCGCTCG GAGCACGGAC 240

ATCGAGAACT CTCGGGGTTC GGCGAACGTT ATCTCAGTGG AATCTCAGTC CACGCGCGCA 300 

ACCTAGTTGT GCAGTTACTG TTGAAAGCCA CACCCATGCC AGTCCACGCA TGGCCAAGTT 360 

GGCCCGAGTA GTGGGCCTAG TACAGGAAGA GCAACCTAGC GACATGACGA ATCACCCACG 420

GTATTCGCCA CCGCCGCAGC AGCCGGGAAC CCCAGGTTAT GCTCAGGGGC AGCAGCAAAC 480

GTACAGCCAG CAGTTCGACT GGCGTTACCC ACCGTCCCCG CCGCCGCAGC CAACCCAGTA 540

CCGTCAACCC TACGAGGCGT TGGGTGGTAC CCGGCCGGGT CTGATACCTG GCGTGATTCC 600

GACCATGACG CCCCCTCCTG GGATGGTTCG CCAACGCCCT CGTGCAGGCA TGTTGGCCAT 660

CGGCGCGGTG ACGATAGCGG TGGTGTCCGC CGGCATCGGC GGCGCGGCCG CATCCCTGGT 720

CGGGTTCAAC CGGGCACCCG CCGGCCCCAG CGGCGGCCCA GTGGCTGCCA GCGCGGCGCC 780

AAGCATCCCC GCAGCAAACA TGCCGCCGGG GTCGGTCGAA CAGGTGGCGG CCAAGGTGGT 840

GCCCAGTGTC GTCATGTTGG AAACCGATCT GGGCCGCCAG TCGGAGGAGG GCTCCGGCAT 900 

CATTCTGTCT GCCGAGGGGC TGATCTTGAC CAACAACCAC GTGATCGCGG CGGCCGCCAA 960

GCCTCCCCTG GGCAGTCCGC CGCCGAAAAC GACGGTAACC TTCTCTGACG GGCGGACCGC 1020

ACCCTTCACG GTGGTGGGGG CTGACCCCAC CAGTGATATC GCCGTCGTCC GTGTTCAGGG 1080 

CGTCTCCGGG CTCACCCCGA TCTCCCTGGG TTCCTCCTCG GACCTGAGGG TCGGTCAGCC 1140

GGTGCTGGCG ATCGGGTCGC CGCTCGGTTT GGAGGGCACC GTGACCACGG GGATCGTCAG 1200 

CGCTCTCAAC CGTCCAGTGT CGACGACCGG CGAGGCCGGC AACCAGAACA CCGTGCTGGA 1260

CGCCATTCAG ACCGACGCCG CGATCAACCC CGGTAACTCC GGGGGCGCGC TGGTGAACAT 1320

GAACGCTCAA CTCGTCGGAG TCAACTCGGC CATTGCCACG CTGGGCGCGG ACTCAGCCGA 1380

TGCGCAGAGC GGCTCGATCG GTCTCGGTTT TGCGATTCCA GTCGACCAGG CCAAGCGCAT 1440

CGCCGACGAG TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC 1500 

CAATGACAAA GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC 1560 

GAACGCTGGA GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG 1620

CGCGGACGCG TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC 1680

CTTTCAGGAT CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA 1740

GTGATGAAGG TCGCCGCGCA GTGTTCAAAG C 1771
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1058 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CTCCACCGCG GTGGCGGCCG CTCTAGAACT AGTGGATCCC CCGGGCTGCA GGAATTCGGC 60 

ACGAGGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120 

AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180 

CCGGCGACGG CGAGCGCCGG AATGGCGCGA GTGAGGAGGC GGGCAGTCAT GCCCAGCGTG 240 

ATCCAATCAA CCTGCATTCG GCCTGCGGGC CCATTTGACA ATCGAGGTAG TGAGCGCAAA 300 

TGAATGATGG AAAACGGGCG GTGACGTCCG CTGTTCTGGT GGTGCTAGGT GCCTGCCTGG 360 

CGTTGTGGCT ATCAGGATGT TCTTCGCCGA AACCTGATGC CGAGGAACAG GGTGTTCCCG 420

TGAGCCCGAC GGCGTCCGAC CCCGCGCTCC TCGCCGAGAT CAGGCAGTCG CTTGATGCGA 480

CAAAAGGGTT GACCAGCGTG CACGTAGCGG TCCGAACAAC CGGGAAAGTC GACAGCTTGC 540

TGGGTATTAC CAGTGCCGAT GTCGACGTCC GGGCCAATCC GCTCGCGGCA AAGGGCGTAT 600

GCACCTACAA CGACGAGCAG GGTGTCCCGT TTCGGGTACA AGGCGACAAC ATCTCGGTGA 660

AACTGTTCGA CGACTGGAGC AATCTCGGCT CGATTTCTGA ACTGTCAACT TCACGCGTGC 720

TCGATCCTGC CGCTGGGGTG ACGCAGCTGC TGTCCGGTGT CACGAACCTC CAAGCGCAAG 780

GTACCGAAGT GATAGACGGA ATTTCGACCA CCAAAATCAC CGGGACCATC CCCGCGAGCT 840

CTGTCAAGAT GCTTGATCCT GGCGCCAAGA GTGCAAGGCC GGCGACCGTG TGGATTGCCC 900 

AGGACGGCTC GCACCACCTC GTCCGAGCGA GCATCGACCT CGGATCCGGG TCGATTCAGC 960 

TCACGCAGTC GAAATGGAAC GAACCCGTCA ACGTCGACTA GGCCGAAGTT GCGTCGACGC 1020

GTTGNTCGAA ACGCCCTTGT GAACGGTGTC AACGGNAC 1058
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GAATTCGGCA CGAGAGGTGA TCGACATCAT CGGGACCAGC CCCACATCCT GGGAACAGGC 60 

GGCGGCGGAG GCGGTCCAGC GGGCGCGGGA TAGCGTCGAT GACATCCGCG TCGCTCGGGT 120 

CATTGAGCAG GACATGGCCG TGGACAGCGC CGGCAAGATC ACCTACCGCA TCAAGCTCGA 180 

AGTGTCGTTC AAGATGAGGC CGGCGCAACC GCGCTAGCAC GGGCCGGCGA GCAAGACGCA 240

AAATCGCACG GTTTGCGGTT GATTCGTGCG ATTTTGTGTC TGCTCGCCGA GGCCTACCAG 300

GCGCGGCCCA GGTCCGCGTG CTGCCGTATC CAGGCGTGCA TCGCGATTCC GGCGGCCACG 360

CCGGAGTTAA TGCTTCGCGT CGACCCGAAC TGGGCGATCC GCCGGNGAGC TGATCGATGA 420

CCGTGGCCAG CCCGTCGATG CCCGAGTTGC CCGAGGAAAC GTGCTGCCAG GCCGGTAGGA 480

AGCGTCCGTA GGCGGCGGTG CTGACCGGCT CTGCCTGCGC CCTCAGTGCG GCCAGCGAGC 540
GG 542

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 913 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CGGTGCCGCC CGCGCCTCCG TTGCCCCCAT TGCCGCCGTC GCCGATCAGC TGCGCATCGC 60 

CACCATCACC GCCTTTGCCG CCGGCACCGC CGGTGGCGCC GGGGCCGCCG ATGCCACCGC 120 

TTGACCCTGG CCGCCGGCGC CGCCATTGCC ATACAGCACC CCGCCGGGGG CACCGTTACC 180 

GCCGTCGCCA CCGTCGCCGC CGCTGCCGTT TCAGGCCGGG GAGGCCGAAT GAACCGCCGC 240

CAAC^,CCCGCC GGCGi:;CACCG ttgci:gcctt TTCCGCCCGC CCCGCCGGCG CCGCCAATTG 300 

CCGAACAGGC AMGCACCGTT GCCGCCAGCC CCGCCGCCGT TAACGGCGCT GCCGGGCGCC 360

GCCGCCGGAC CCGCCATTAC CGCCGTTCCC GTTCGGTGCC CCGCCGTTAC CGGCGCCGCC 420

GTTTGCCGCC AATATTCGGC GGGCACCGCC AGACCCGCCG GGGCCACCAT TGCCGCCGGG 480

CACCGAAACA ACAGCCCAAC GGTGCCGCCG GCCCCGCCGT TTGCCGCCAT CACCGGCCAT 540 

TCACCGCCAG CACCGCCGTT AATGTTTATG AACCCGGTAC CGCCAGCGCG GCCCCTATTG 600 

CCGGGCGCCG GAGNGCGTGC CCGCCGGCGC CGCCAACGCC CAAAAGCCCG GGGTTGCCAC 660

CGGCCCCGCC GGACCCACCG GTCCCGCCGA TCCCCCCGTT GCCGCCGGTG CCGCCGCCAT 720 

TGGTGCTGCT GAAGCCGTTA GCGCCGGTTC CGCSGGTTCC GGCGGTGGCG CCNTGGCCGC 780 

CGGCCCCGCC GTTGCCGTAC AGCCACCCCC CGGTGGCGCC GTTGCCGCCA TTGCCGCCAT 840

TGCCGCCGTT GCCGCCATTG CCGCCGTTCC CGCCGCCACC GCCGGNTTGG CCGCCGGCGC 900 

CGCCGGCGGC CGC 913 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1872 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GACTACGTTG GTGTAGAAAA ATCCTGCCGC CCGGACCCTT AAGGCTGGGA CAATTTCTGA 60 

TAGCTACCCC GACACAGGAG GTTACGGGAT GAGCAATTCG CGCCGCCGCT CACTCAGGTG 120

GTCATGGTTG CTGAGCGTGC TGGCTGCCGT CGGGCTGGGC CTGGCCACGG CGCCGGCCCA 180

GGCGGCCCCG CCGGCCTTGT CGCAGGACCG GTTCGCCGAC TTCCCCGCGC TGCCCCTCGA 240

CCCGTCCGCG ATGGTCGCCC AAGTGGCGCC ACAGGTGGTC AACATCAACA CCAAACTGGG 300 

CTACAACAAC GCCGTGGGCG CCGGGACCGG CATCGTCATC GATCCCAACG GTGTCGTGCT 360 

GACCAACAAC CACGTGATCG CGGGCGCCAC C'^ACATCAAT GCGTTCAGCG TCGGCTCCGG 420

CCAAACCTAC GGCGTCGATG TGGTCGGGTA TGA':CGCACC CAGGATGTCG CGGTGCTGCA 480

GCTGCGCGGT r^CCGGTGGCC TGCCGTCGGC GGCr^ATCGGT GGGGGCGTITG CGGTTG3TGA 540

GCCCGTCGTC -^CGATGGGCA ACAi:^CGGTGG GCAGGGCGGA ACGCCCCGTG CGGTGCCTGG 600

cagggtggt:: '^CGCTCGGCC AAJ\CCGTGCA GGCGTCGGAT TCGCTGACCG GTGCCGTVAGA 660

GACATTGAAC GGGTTGATCC AGTTCGATGC CGCAATCCAG CCCGGTGATT CGGGCGGGCC 720

CGTCGTCAAC GGCCTAGGAC AGGTGGTCGG TATGAACACG GCCGCGTCCG ATAACTTCCA 780

GCTGTCCCAG GGTGGGCAGG GATTCGCCAT TCCGATCGGG CAGGCGATGG CGATCGCGGG 840

CCAAATCCGA TCGGGTGGGG GGTCACCCAC CGTTCATATC GGGCCTACCG CCTTCCTCGG 900 

CTTGGGTGTT GTCGACAACA ACGGCAACGG CGCACGAGTC CAACGCGTGG TCGGAAGCGC 960 

TCCGGCGGCA AGTCTCGGCA TCTCCACCGG CGACGTGATC ACCGCGGTCG ACGGCGCTCC 1020 

GATCAACTCG GCCACCGCGA TGGCGGACGC GCTTAACGGG CATCATCCCG GTGACGTCAT 1080 

CTCGGTGAAC TGGCAAACCA AGTCGGGCGG CACGCGTACA GGGAACGTGA CATTGGCCGA 1140 

GGGACCCCCG GCCTGATTTG TCGCGGATAC CACCCGCCGG CCGGCCAATT GGATTGGCGC 1200 

CAGCCGTGAT TGCCGCGTGA GCCCCCGAGT TCCGTCTCCC GTGCGCGTGG CATTGTGGAA 1260

GCAATGAACG AGGCAGAACA CAGCGTTGAG CACCCTCCCG TGCAGGGCAG TTACGTCGAA 1320

GGCGGTGTGG TCGAGCATCC GGATGCCAAG GACTTCGGCA GCGCCGCCGC CCTGCCCGCC 1380

GATCCGACCT GGTTTAAGCA CGCCGTCTTC TACGAGGTGC TGGTCCGGGC GTTCTTCGAC 1440

GCCAGCGCGG ACGGTTCCGN CGATCTGCGT GGACTCATCG ATCGCCTCGA CTACCTGCAG 1500 

TGGCTTGGCA TCGACTGCAT CTGTTGCCGC CGTTCCTACG ACTCACCGCT GCGCGACGGC 1560 

GGTTACGACA TTCGCGACTT CTACAAGGTG CTGCCCGAAT TCGGCACCGT CGACGATTTC 1620 

GTCGCCCTGG TCGACACCGC TCACCGGCGA GGTATCCGCA TCATCACCGA CCTGGTGATG 1680

AATCACACCT CGGAGTCGCA CCCCTGGTTT CAGGAGTCCC GCCGCGACCC AGACGGACCG 1740

TACGGTGACT ATTACGTGTG GAGCGACACC AGCGAGCGCT ACACCGACGC CCGGATCATC 1800

TTCGTCGACA CCGAAGAGTC GAACTGGTCA TTCGATCCTG TCCGCCGACA GTTNCTACTG 1860

GCACCGATTC TT 1872 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1482 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

CTTCGCCGAA ACCTGATGCC GAGGAACAGG GTGTTCCCGT GAGCCCGACG GCGTCCGACC 60

CCGCGCTCCT CGCCGAGATC AGGCAGTCGC TTGATGCGAC AAAAGGGTTG ACCAGCGTGC 120

ACGTAGCGGT CCGAACAACC GGGAAAGTCG ACAGCTTGCT GGGTATTACC AGTGCCGATG 180

TCGACGTCCG GGCCAATCCG CTCGCGGCAA AGGGCGTATG CACCTACAAC GACGAGCAGG 240

GTGTCCCGTT TCGGGTACAA GGCGACAACA TCTCGGTGAA ACTGTTCGAC GACTGGAGCA 300

ATCTCGGCTC GATTTCTGAA CTGTCAACTT CACGCGTGCT CGATCCTGCC GCTGGGGTGA 360

CGCAGCTGCT GTCCGGTGTC ACGAACCTCC AAGCGCAAGG TACCGAAGTG ATAGACGGAA 420

TTTCGACCAC CAAAATCACC GGGACCATCC CCGCGAGCTC TGTCAAGATG CTTGATCCTG 480

GCGCCAAGAG TGCAAGGCCG GCGACCGTGT GGATTGCCCA GGACGGCTCG CACCACCTCG 540

TCCGAGCGAG CATCGACCTC GGATCCGGGT CGATTCAGCT CACGCAGTCG AAATGGAACG 600

AACCCGTCAA CGTCGACTAG GCCGAAGTTG CGTCGACGCG TTGCTCGAAA CGCCCTTGTG 660

AACGGTGTCA ACGGCACCCG AAAACTGACC CCCTGACGGC ATCTGAAAAT TGACCCCCTA 720

GACCGGGCGG TTGGTGGTTA TTCTTCGGTG GTTCCGGCTG GTGGGACGCG GCCGAGGTCG 780

CGGTCTTTGA GCCGGTAGCT GTCGCCTTTG AGGGCGACGA CTTCAGCATG GTGGACGAGG 840

CGGTCGATCA TGGCGGCAGC AACGACGTCG TCGCCGCCGA AAACCTCGCC CCACCGGCCG 900 

AAGGCCTTAT TGGACGTGAC GATCAAGCTG GCCCGCTCAT ACCGGGAGGA CACCAGCTGG 960 

AAGAAGAGGT TGGCGGCCTC GGGCTCAAAC GGAATGTAAC CGACTTCGTC AACCACCAGG 1020

AGCGGATAGC GGCCAAACCG GGTGAGTTCG GCGTAGATGC GCCCGGCGTG GTGAGCCTCG 1080

GCGAACCGTG CTACCCATTC GGCGGCGGTG GCGAACAGCA CCCGATGACC GGCCTGACAC 1140

GCGCGTATCG CCAGGCCGAC CGCAAGATGA GTCTTCCCG& TGCCAGGCGG GGCCCAAAAA 1200 

CACGACGTTA TCGCGGGCGG TGATGAAATC CAGGGTGCCC AGATGTGCGA TGGTGTCGCG 1260

TTTGAGGCCA CGAGCATGCT CAAAGTCGAA CTCTTCCAAC GACTTCCGAA CCGGGAAGCG 1320

GGCGGCGCGG ATGCGGCCCT CACCACCATG GGACTCCCGG GCTGACACTT CCCGCTGCAG 1380

GCAGGCGGCC AGGTATTCTT CGTGGCTCCA GTTCTCGGCG CGGGCGCGAT CGGCCAGCCG 1440

GGACACTGAC TCACGCAGGG TGGGAGCTTT CAATGCTCTT GT 1482 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 876 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

GAATTCGGCA CGAGCCGGCG ATAGCTTCTG GGCCGCGGCC GACCAGATGG CTCGAGGGTT 60 

CGTGCTCGGG GCCACCGCCG GGCGCACCAC CCTGACCGGT GAGGGCCTGC AACACGCCGA 120 

CGGTCACTCG TTGCTGCTGG ACGCCACCAA CCCGGCGGTG GTTGCCTACG ACCCGGCCTT 180 

CGCCTACGAA ATCGGCTACA TCGNGGAAAG CGGACTGGCC AGGATGTGCG GGGAGAACCC 240 

GGAGAACATC TTCTTCTACA TCACCGTCTA CAACGAGCCG TACGTGCAGC CGCCGGAGCC 300 

GGAGAACTTC GATCCCGAGG GCGTGCTGGG GGGTATCTAC CGNTATCACG CGGCCACCGA 360

GCAACGCACC AACAAGGNGC AGATCCTGGC CTCCGGGGTA GCGATGCCCG CGGCGCTGCG 420

GGCAGCACAG ATGCTGGCCG CCGAGTGGGA TGTCGCCGCC GACGTGTGGT CGGTGACCAG 480

TTGGGGCGAG CTAAACCGCG ACGGGGTGGT CATCGAGACC GAGAAGCTCC GCCACCCCGA 540

TCGGCCGGCG GGCGTGCCCT ACGTGACGAG AGCGCTGGAG AATGCTCGGG GCCCGGTGAT 600 

CGCGGTGTCG GACTGGATGC GCGCGGTCCC CGAGCAGATC CGACCGTGGG TGCCGGGCAC 660

ATACCTCACG TTGGGCACCG ACGGGTTCGG TTTTTCCGAC ACTCGGCCCG CCGGTCGTCG 720 

TTACTTCAAC ACCGACGCCG AATCCCAGGT TGGTCGCGGT TTTGGGAGGG GTTGGCCGGG 780 

TCGACGGGTG AATATCGACC CATTCGGTGC CGGTCGTGGG CCGCCCGCCC AGTTACCCGG 840

ATTCGACGAA GGTGGGGGGT TGCGCCCGAN TAAGTT 876

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1021 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ATCCCCCCGG GCTGCAGGAA TTCGGCACGA GAGACAAAAT TCCACGCGTT AATGCAGGAA 60

CAGATTCATA ACGAATTCAC AGCGGCACAA CAATATGTCG CGATCGCGGT TTATTTCGAC 120

AGCGAAGACC TGCCGCAGTT GGCGAAGCAT TTTTACAGCC AAGCGGTCGA GGAACGAAAC 180 

CATGCAATGA TGCTCGTGCA ACACCTGCTC GACCGCGACC TTCGTGTCGA AATTCCCGGC 240

GTAGACACGG TGCGAAACCA GTTCGACAGA CCCCGCGAGG CACTGGCGCT GGCGCTCGAT 300

CAGGAACGCA CAGTCACCGA CCAGGTCGGT CGGCTGACAG CGGTGGCCCG CGACGAGGGC 360

GATTTCCTCG GCGAGCAGTT CATGCAGTGG TTCTTGCAGG AACAGATCGA AGAGGTGGCC 420

TTGATGGCAA CCCTGGTGCG GGTTGCCGAT CGGGCCGGGG CCAACCTGTT CGAGCTAGAG 480

AACTTCGTCG CACGTGAAGT GGATGTGGCG CCGGCCGCAT CAGGCGCCCC GCACGCTGCC 540

GGGGGCCGCC TCTAGATCCC TGGGGGGGAT CAGCGAGTGG TCCCGTTCGC CCGCCCGTCT 600 

TCCAGCCAGG CCTTGGTGCG GCCGGGGTGG TGAGTACCAA TCCAGGCCAC CCCGACCTCC 660 

CGGNAAAAGT CGATGTCCTC GTACTCATCG ACGTTCCAGG AGTACACCGC CCGGCCCTGA 720 

GCTGCCGAGC GGTCAACGAG TTGCGGATAT TCCTTTAACG CAGGCAGTGA GGGTCCCACG 780

GCGGTTGGCC CGACCGCCGT GGCCGCACTG CTGGTCAGGT ATCGGGGGGT CTTGGCGAGC 840

AACAACGTCG GCAGGAGGGG TGGAGCCCGC CGGATCCGCA GACCGGGGGG GCGAAAACGA 900

CATCAACACC GCACGGGATC GATCTGCGGA GGGGGGTGCG GGAATACCGA ACCGGTGTAG 960 

GAGCGCCAGC AGTTGTTTTT CCACCAGCGA AGCGTTTTCG GGTCATCGGN GGCNNTTAAG 1020 

T 1021 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

CGTGCCGACG AACGGAAGAA CACAACCATG AAGATGGTGA AATCGATCGC CGCAGGTCTG 60 

ACCGCCGCGG CTGCAATCGG CGCCGCTGCG GCCGGTGTGA CTTCGATCAT GGCTGGCGGN 120 

CCGGTCGTAT ACCAGATGCA GCCGGTCGTC TTCGGCGCGC CACTGCCGTT GGACCCGGNA 180

TCCGCCCCTG ANGTCCCGAC CGCCGCCCAG TGGACCAGNC TGCTCAACAG NCTCGNCGAT 240

CCCAACGTGT CGTTTGNGAA CAAGGGNAGT CTGGTCGAGG GNGGNATCGG NGGNANCGAG 300 

GGNGNGNATC GNCGANCACA A 321 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 373 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

TCTTATCGGT TCCGGTTGGC GACGGGTTTT GGGNGCGGGT GGTTAACCCG CTCGGCCAGC 60 

CGATCGACGG GCGCGGAGAC GTCGACTCCG ATACTCGGCG CGCGCTGGAG CTCCAGGCGC 120 

CCTCGGTGGT GNACCGGCAA GGCGTGAAGG AGCCGTTGNA GACCGGGATC AAGGCGATTG 180 

ACGCGATGAC CCCGATCGGC CGCGGGCAGC GCCAGCTGAT CATCGGGGAC CGCAAGACCG 240

GCAAAAACCG CCGTCTGTGT CGGACACCAT CCTCAAACCA GCGGGAAGAA CTGGGAGTCC 300 

GGTGGATCCC AAGAAGCAGG TGCGCTTGTG TATACGTTGG CCATCGGGCA AGAAGGGGAA 360 

CTTACCATCG CCG 373
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 352 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

GTGACGCCGT ■TrATGGGATTC CTGGGCGGGG CCGGTCCGCT Gf^CGi^.TGGTG GATCAGCAAC 60 

TGGTTACCCG :-GTGCCGC/\A GGCTGGTCGT TTGCTCAGGC Ar^CCGCTGTG CCGGTGGTGT 120 

TCTTGACGGC CTGGTACGGG TTGGCCGATT TAGCCGAGAT CAAGGCGGGC GAATCGGTGC 180

TGATCCATGC CGGTACCGGC GGTGTGGGCA TGGCGGCTGT GCAGCTGGCT CGCCAGTGGG 240

GCGTGGAGGT TTTCGTCACC GCCAGCCGTG GNAAGTGGGA CACGCTGCGC GCCATNGNGT 300
TTGACGACGA NCCATATCGG NGATTCCCNC ACATNCGAAG TTCCGANGGA GA 352 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 726 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

GAAATCCGCG TTCATTCCGT TCGACCAGCG GCTGGCGATA ATCGACGAAG TGATCAAGCC 60 

GCGGTTCGCG GCGCTCATGG GTCACAGCGA GTAATCAGCA AGTTCTCTGG TATATCGCAC 120

CTAGCGTCCA GTTGCTTGCC AGATCGCTTT CGTACCGTCA TCGCATGTAC CGGTTCGCGT 180

GCCGCACGCT CATGCTGGCG GCGTGCATCC TGGCCACGGG TGTGGCGGGT CTCGGGGTCG 240

GCGCGCAGTC CGCAGCCCAA ACCGCGCCGG TGCCCGACTA CTACTGGTGC CCGGGGCAGC 300 

CTTTCGACCC CGCATGGGGG CCCAACTGGG ATCCCTACAC CTGCCATGAC GACTTCCACC 360 

GCGACAGCGA CGGCCCCGAC CACAGCCGCG ACTACCCCGG ACCCATCCTC GAAGGTCCCG 420

TGCTTGACGA TCCCGGTGCT GCGCCGCCGC CCCCGGCTGC CGGTGGCGGC GCATAGCGCT 480

CGTTGACCGG GCCGCATCAG CGAATACGCG TATAAACCCG GGCGTGCCCC CGGCAAGCTA 540

CGACCCCCGG CGGGGCAGAT TTACGCTCCC GTGCCGATGG ATCGCGCCGT CCGATGACAG 600

AAAATAGGCG ACGGTTTTGG CAACCGCTTG GAGGACGCTT GAAGGGAACC TGTCATGAAC 660

GGCGACAGCG CCTCCACCAT CGACATCGAC TVAGGTTGTTA CCCGCACACC CGTTCGCCGG 720

ATCGTG 726

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 580 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

CGCGACGACG ACGAACGTCG GGCCCACCAC CGCCTATGCG TTGATGCAGG CGACCGGGAT 60 

GGTCGCCGAC CATATCCAAG CATGCTGGGT GCCCACTGAG CGACCTTTTG ACCAGCCGGG 120 

CTGCCCGATG GCGGCCCGGT GAAGTCATTG CGCCGGGGCT TGTGCACCTG ATGAACCCGA 180 

ATAGGGAACA ATAGGGGGGT GATTTGGCAG TTCAATGTCG GGTATGGCTG GAAATCCAAT 240

GGCGGGGCAT GCTCGGCGCC GACCAGGCTC GCGCAGGCGG GCCAGCCCGA ATCTGGAGGG 300 

AGCACTCAAT GGCGGCGATG AAGCCCCGGA CCGGCGACGG TCCTTTGGAA GCAACTAAGG 360 

AGGGGCGCGG CATTGTGATG CGAGTACCAC TTGAGGGTGG CGGTCGCCTG GTCGTCGAGC 420

TGACACCCGA CGAAGCCGCC GCACTGGGTG ACGAACTCAA AGGCGTTACT AGCTAAGACC 480

AGCCCAACGG CGAATGGTCG GCGTTACGCG CACACCTTCC GGTAGATGTC CAGTGTCTGC 540

TCGGCGATGT ATGCCCAGGA GAACTCTTGG ATACAGCGCT 580 
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

AACGGAGGCG CCGGGGGTTT TGGCGGGGCC GGGGCGGTCG GCGGCAACGG CGGGGCCGGC 60 

GGTACCGCCG GGTTGTTCGG TGTCGGCGGG GCCGGTGGGG CCGGAGGCAA CGGCATCGCC 120

GGTGTCACGG GTACGTCGGC CAGCACACCG GGTGGATCCG 160 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

GACACCGATA CGATGGTGAT GTACGCCAAC GTTGTCGACA CGCTCGAGGC GTTCACGATC 60

CAGCGCACAC CCGACGGCGT GACCATCGGC GATGCGGCCC CGTTCGCGGA GGCGGCTGCC 120

AAGGCGATGG GAATCGACAA GCTGCGGGTA ATTCATACCG GAATGGACCC CGTCGTCGCT 180

GAACGCGAAC AGTGGGACGA CGGCAACAAC ACGTTGGCGT TGGCGCCCGG TGTCGTTGTC 240

GCCTACGAGC GCAACGTACA GACCAACGCC CG 272



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 317 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

GCAGCCGGTG GTTCTCGGAC TATCTGCGCA CGGTGACGCA GCGCGACGTG CGCGAGCTGA 60 

AGCGGATCGA GCAGACGGAT CGCCTGCCGC GGTTCATGCG CTACCTGGCC GCTATCACCG 120 

CGCAGGAGCT GAACGTGGCC GAAGCGGCGC GGGTCATCGG GGTCGACGCG GGGACGATCC 180

GTTCGGATCT GGCGTGGTTC GAGACGGTCT ATCTGGTACA TCGCCTGCCC GCCTGGTCGC 240

GGAATCTGAC CGCGAAGATC AAGAAGCGGT CAAAGATCCA CGTCGTCGAC AGTGGCTTCG 300 

CGGCCTGGTT GCGCGGG 317 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 182 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

GATCGTGGAG CTGTCGATGA ACAGCGTTGC CGGACGCGCG GCGGCCAGCA CGTCGGTGTA 60

GCAGCGCCGG ACCACCTCGC CGGTGGGCAG CATGGTGATG ACCACGTCGG CCTCGGCCAC 120

CGCTTCGGGC GCGCTACGAA ACACCGCGAC ACCGTGCGCG GCGGCGCCGG ACGCCGCCGT 180

GG 182
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 308 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

GATCGCGAAG TTTGGTGAGC AGGTGGTCGA CGCGAAAGTC TGGGCGCCTG CGAAGCGGGT 60 

CGGCGTTCAC GAGGCGAAGA CACGCCTGTC CGAGCTGCTG CGGCTCGTCT ACGGCGGGCA 120 

GAGGTTGAGA TTGCCCGCCG CGGCGAGCCG GTAGCAAAGC TTGTGCCGCT GCATCCTCAT 180 

GAGACTCGGC GGTTAGGCAT TGACCATGGC GTGTACCGCG TGCCCGACGA TTTGGACGCT 240

CCGTTGTCAG ACGACGTGCT CGAACGCTTT CACCGGTGAA GCGCTACCTC ATCGACACCC 300 

ACGTTTGG 308 

(2) INFORMATION FOR SEQ ID NO: 31: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

CCGACGACGA GCAACTCACG TGGATGATGG TCGGCAGCGG CATTGAGGAC GGAGAGAATC 60

CGGCCGAAGC TGCCGCGCGG CAAGTGCTCA TAGTGACCGG CCGTAGAGGG CTCCCCCGAT 120 

GGCACCGGAC TATTCTGGTG TGCCGCTGGC CGGTAAGAGC GGGTAAAAGA ATGTGAGGGG 180 

ACACGATGAG CAATCACACC TACCGAGTGA TCGAGATCGT CGGGACCTCG CCCGACGGCG 240

TCGACGCGGC AATCCAGGGC GGTCTGG 267
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1539 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

CTCGTGCCGA AAGAATGTGA GGGGACACGA TGAGCAATCA CACCTACCGA GTGATCGAGA 60 

TCGTCGGGAC CTCGCCCGAC GGCGTCGACG CGGCAATCCA GGGCGGTCTG GCCCGAGCTG 120 

CGCAGACCAT GCGCGCGCTG GACTGGTTCG AAGTACAGTC AATTCGAGGC CACCTGGTCG 180 

ACGGAGCGGT CGCGCACTTC CAGGTGACTA TGAAAGTCGG CTTCCGCTGG AGGATTCCTG 240

AACCTTCAAG CGCGGCCGAT AACTGAGGTG CATCATTAAG CGACTTTTCC AGAACATCCT 300 

GACGCGCTCG AAACGCGGTT CAGCCGACGG TGGCTCCGCC GAGGCGCTGC CTCCAAAATC 360

CCTGCGACAA TTCGTCGGCG GCGCCTACAA GGAAGTCGGT GCTGAATTCG TCGGGTATCT 420

GGTCGACCTG TGTGGGCTGC AGCCGGACGA AGCGGTGCTC GACGTCGGCT GCGGCTCGGG 480 

GCGGATGGCG TTGCCGCTCA CCGGCTATCT GAACAGCGAG GGACGCTACG CCGGCTTCGA 540

TATCTCGCAG AAAGCCATCG CGTGGTGCCA GGAGCACATC ACCTCGGCGC ACCCCAACTT 600

CCAGTTCGAG GTCTCCGACA TCTACAACTC GCTGTACAAC CCGAAAGGGA AATACCAGTC 660 

ACTAGACTTT CGCTTTCCAT ATCCGGATGC GTCGTTCGAT GTGGTGTTTC TTACCTCGGT 720

GTTCACCCAC ATGTTTCCGC CGGACGTGGA GCACTATCTG GACGAGATCT CCCGCGTGCT 780

GAAGCCCGGC GGACGATGCC TGTGCACGTA CTTCTTGCTC AATGACGAGT CGTTAGCCCA 840

CATCGCGGAA GGAAAGAGTG CGCACAACTT CCAGCATGAG GGACCGGGTT ATCGGACAAT 900 

CCACAAGAAG CGGCCCGAAG AAGCAATCGG TTTGCCGGAG ACCTTCGTCA GGGATGTCTA 960 

TGGCAAGTTC GGCCTCGCCG TGCACGA^ACC ATTGCACTAC GGCTCATGGA GTGGCCGGGA 1020 

ACCACGCCTA AGCTTCCAGG ACATCGT^rAT GGCGACCAAA ACCGCGAGCT AG'I^TGGGCAT 1080

CCGGGAAGCA TCGCGACACC GTGGCGCCGA ^CGCCGCTGC CGGCAGGGCG ATTAGGCGGG 1140

CAGATTAGCC CGCCGCGGCT CCCGGCTCCG AGTACGGCGC CCCGAATi^i:;C GTCACCGGCT 1200 

GGTAACCACG CTTGCGC3CC TGGGCGGCGG GCTGrCGGAT CAGGTGGTAG ATGCCGACAA 1260

AGCCTGCGTG ATCGGTCATC ACCAACGGTG ACAGCAGCCG GTTGTGCACC AGCGCGAJVCG 1320

CCACCCCGGT CTCCGGGTCT GTCCAGCCGA TCGAGCCGCC CAAGCCCACA TGACCAAACC 1380 

CCGGCATCAC GTTGCCGATC GGCATACCGT GATAGCCAAG ATGAAAATTT AAGGGCACCA 1440

ATAGATTTCG ATCCGGCAGA ACTTGCCGTC GGTTGCGGGT CAGGCCCGTG ACCAGCTCCC 1500 

GCGACAAGAA CCGTATGCCG TCGATCTCGC CTCGTGCCG 1539 
(2) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 851 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 

CTGCAGGGTG GCGTGGATGA GCGTCACCGC GGGGCAGGCC GAGCTGACCG CCGCCCAGGT 60 

CCGGGTTGCT GCGGCGGCCT ACGAGACGGC GTATGGGCTG ACGGTGCCCC CGCCGGTGAT 120 

CGCCGAGAAC CGTGCTGAAC TGATGATTCT GATAGCGACC AACCTCTTGG GGCAAAACAC 180 

CCCGGCGATC GCGGTCAACG AGGCCGAATA CGGCGAGATG TGGGCCCAAG ACGCCGCCGC 240

GATGTTTGGC TACGCCGCGG CGACGGCGAC GGCGACGGCG ACGTTGCTGC CGTTCGAGGA 300 

GGCGCCGGAG ATGACCAGCG CGGGTGGGCT CCTCGAGCAG GCCGCCGCGG TCGAGGAGGC 360

CTCCGACACC GCCGCGGCGA ACCAGTTGAT GAACAATGTG CCCCAC^GCGC TGAAACAGTT 420

GGCCCAGCCC ACGCAGGGCA CCACGCCTTC TTCCAAGCTG GGTGGCCTGT GGAAGACGGT 480

CTCGCCGCAT CGGTCGCCGA TCAGCAACAT GGTGTCGATG GCCAACAACC ACATGTCGAT 540

GACCAACTCC GGTGTGTCGA TGACC7VACAC CTTGAGCTCG ATGTTGAAGG GCTTTGCTCC 600 

CGCGGCGGCC GCCCAGGCCG TGC7\7\ACCGC i:GCGCAAAAC GGGGTCCGGG CGATGAGCTC 660

GCTGGGCAGC TCGCTGGGTT CTTCGGGTCT GGGCGGTGGG GTGGCCGCCA ACTTGGGTCG 720 

GGCGGCCTCG GTACGGTATG GTCACCGGGA TGGCGGAAAA TATGCANAGT CTGGTCGGCG 780

GAACGGTGGT CCGGCGTAAG GTTTACCCCG GTTTTCTGGA TGCGGTGAAC TTCGTCAACG 840

GAAACAGTTA C 851
(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 254 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

GATCGATCGG GCGGAAATTT GGACCAGATT CGCCTCCGGC GATAACCCM TCAATCGAAC 60 

CTAGATTTAT TCCGTCCAGG GGCCCGAGTA ATGGCTCGCA GGAGAGGAAC CTTACTGCTG 120 

CGGGCACCTG TCGTAGGTCC TCGATACGGC GGAAGGCGTC GACATTTTCC ACCGACACCC 180 

CCATCCAAAC GTTCGAGGGC CACTCCAGCT TGTGAGCGAG GCGACGCAGT CGCAGGCTGC 240

GCTTGGTCAA GATC 254
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1227 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

GATCCTGACC GAAGCGGCCG '^CGCCAAGGC GAAGTCGCTG TTGGACCAGG AGGGACGGGA 60

CGATCTGGCG CTGCGGATCG CGGTTCAGCC GGGGGGGTGC GCTGGATTGC GCTATAACCT 120 

TTTCTTCGAC GACCGGACGC TGGATGGTGA CCAAACCGCG GAGTTCGGTG GTGTCAGGTT 180 

GATCGTGGAC CGGAl'GAGCG CGCCGTATGT GGAAGGCGCG TCGATCGATT TCGTCGACAC 240

TATTGAGAAG CA.^GGTTCAC CATCGACAAT CCCAACGCCA CCGGCTCCTG CGCGTGCGGG 300 

GATTCGTTCA ACTGATA.Q^-\ CGCTAGTACG ACCCCGGGl^T GCG'.:AAGACG TACGAGCACA 360

cc7vagacctc accgcgctgg afw^agc/vact gagcgatgci; ttgcacgtga ccgcgtggcg 420

ggccgccggc ggca':^gtgtc acgtgcatgg tgaacagcac ctgi^gcctga tattgcgacc 480

AGTACACGAT TTTGTCGATC GAGGTCACTT CGACCTGGGA GAACTGCTTG CGGAACGCGT 540

WO 98/16645 PCT/US97/18214

CGCTGCTCAG CTTGGCCAAG GCCTGATCGG AGCGCTTGTC GCGCACGCCG TCGTGGATAC 600 

CGCACAGCGC ATTGCGAACG ATGGTGTCCA CATCGCGGTT CTCCAGCGCG TTGAGGTATC 660 

CCTGAATCGC GGTTTTGGCC GGTCCCTCCG AGAATGTGCC TGCCGTGTTG GCTCCGTTGG 720

TGCGGACCCC GTATATGATC GCCGCCGTCA TAGCCGACAC CAGCGCGAGG GCTACCACAA 780

TGCCGATCAG CAGCCGCTTG TGCCGTCGCT TCGGGTAGGA CACCTGCGGC GGCACGCCGG 840

GATATGCGGC GGGCGGCAGC GCCGCGTCGT CTGCCGGTCC CGGGGCGAAG GCCGGTTCGG 900 

CGGCGCCGAG GTCGTGGGGG TAGTCCAGGG CTTGGGGTTC GTGGGATGAG GGCTCGGGGT 960 

ACGGCGCCGG TCCGTTGGTG CCGACACCGG GGTTCGGCGA GTGGGGACCG GGCATTGTGG 1020

TTCTCCTAGG GTGGTGGACG GGACCAGCTG CTAGGGCGAC AACCGCCCGT CGCGTCAGCC 1080

GGCAGCATCG GCAATCAGGT GAGCTCCCTA GGCAGGCTAG CGCAACAGCT GCCGTCAGCT 1140

CTCAACGCGA CGGGGCGGGC CGCGGCGCCG ATAATGTTGA AAGACTAGGC AACCTTAGGA 1200

ACGAAGGACG GAGATTTTGT GACGATC 1227
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGGGCCGGC GGGGCCGGCG 60 

GGACCGGCGC TAACGGTGGT GCCGGCGGCA ACGCCTGGTT GTTCGGGGCC GGCGGGTCCG 120 

GCGGNGCCGG CACCAATGGT GGNGTCGGCG GGTCCGGCGG ATTTGTCTAC GGCAACGGCG 180 



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 290 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGTGTCGGC GGCCGGGGCG 60 

GCGACGGCGT CTTTGCCGGT GCCGGCGGCC AGGGCGGCCT CGGTGGGCAG GGCGGCAATG 120 

GCGGCGGCTC CACCGGCGGC AACGGCGGTC TTGGCGGCGC GGGCGGTGGC GGAGGCAACG 180

CCCCGGACGG CGGCTTCGGT GGCAACGGCG GTAAGGGTGG CCAGGGCGGN ATTGGCGGCG 240

GCACTCAGAG CGCGACCGGC CTCGGNGGTG ACGGCGGTGA CGGCGGTGAC 290
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
GATCCAGTGG CATGGNGGGT GTCAGTGGAA GCAT 34 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GATCGCTGCT CGTCCCC'^CC TTGCCGCCGA CGCCACCGGT CCCACCGTTA CCG7\ACAAGC 60 
TGGCGTGGTC GCCAGCAXC CCGG(::ACCGC CGACGCCGGA GTCGAACAAT GGCACCGTCG 120

TATCCCCACC ATTGCCi"::CG GNCCCACCGG CACCG 155 
(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 53 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
ATGGCGTTCA CGGGGCGCCG GGGACCGGGC AGCCCGGNGG GGCCGGGGGG TGG 53 
(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
GATCCACCGC GGGTGCAGAC GGTGCCCGCG GCGCCACCCC GACCAGCGGC GGCAACGGCG 60 
GCACCGGCGG CAACGGCGCG AACGCCACCG TCGTCGGNGG GGCCGGGGGG GCCGGCGGCA 120 
AGGGCGGCAA CG 132 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
GATCGGCGGC CGGNACGGNC GGGGACGGCG GCAAGGGCGG NAAC^GGGGC GCCGNAGCCA 60 
CCNGCCAAGA ATi:CTCCGNG TCCNCCAATG GCGCGAATGG CGGA:AGGGC GGC/\ACGGCG 120 
GCANCGGCGG CA 132 
(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 702 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 

CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60 

CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120 

ATGAACGGGC GGCATCAA.^\T TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180 

AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240

AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300

CCATCACACC GTGCGAACTC ACGGNGGNTA AAAACGCCGC CCAACAGNTG GTNTTGTCCG 360 

CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420

CGCTGCGCAA CGCGGCCAAG GNGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480

ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540

CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600

TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTGNG 660

GGGATGGGTG GAACACTTNC ACCCTGACGC TGCAAGGCGA CG 702
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 298 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GAAGCCGCAC CGCTCTCGGG CGACGTGGCG GTC/VJ\GCGG CATCGCTCGG TGGCGGTGGA 60

GCCGGCGGGG TGCCGTCGGC GCCGTTGGGA TCCGCGATCG GGGGCGCCGA ATCGGTGCGG 120 

CCCGCTGGCG CTGGTGACAT TGCCGGCTTA Gi:;Ci:AGGGAA GGGCCGGCGG CGGCGCCGCG 180

CTGGGCGGCG GTGGCATGGG AATGCCGATG GGTGCCGCGC ATCAGGGACA AGGGGGCGCC 240

AAGTCCAAGG GTTCTCAGCA GGAAGACGAG GCGCTCTACA CCGAGGATCC TCGTGCCG 298 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

CGGCACGAGG ATCGAATCGC GTCGCCGGGA GCACAGCGTC GCACTGCACC AGTGGAGGAG 60 

CCATGACCTA CTCGCCGGGT AACCCCGGAT ACCCGCAAGC GCAGCCCGCA GGCTCCTACG 120 

GAGGCGTCAC ACCCTCGTTC GCCCACGCCG ATGAGGGTGC GAGCAAGCTA CCGATGTACC 180 

TGAACATCGC GGTGGCAGTG CTCGGTCTGG CTGCGTACTT CGCCAGCTTC GGCCCAATGT 240 

TCACCCTCAG TACCGAACTC GGGGGGGGTG ATGGCGCAGT GTCCGGTGAC ACTGGGCTGC 300 

CGGTCGGGGT GGCTCTGCTG GCTGCGCTGC TTGCCGGGGT GGTTCTGGTG CCTAAGGCCA 360 

AGAGCCATGT GACGGTAGTT GCGGTGCTCG GGGTACTCGG CGTATTTCTG ATGGTCTCGG 420

CGACGTTTAA CAAGCCCAGC GCCTATTCGA CCGGTTGGGC ATTGTGGGTT GTGTTGGCTT 480

TCATCGTGTT CCAGGCGGTT GCGGCAGTCC TGGCGCTCTT GGTGGAGACC GGCGCTATCA 540

CCGCGCCGGC GCCGCGGCCC AAGTTCGACC CGTATGGACA GTACGGGCGG TACGGGCAGT 600

ACGGGGAGTA CGGGGTGCAG CCGGGTGGGT ACTACGGTCA GCAGGGTGCT CAGCAGGCCG 660 

CGGGACTGCA GTCGCCCGGC CCGCAGCAGT CTCCGCAGCC TCCCGGATAT GGGTCGCAGT 720

ACGGCGGCTA TTCGTCCAGT CCGAGCCAAT CGGGCAGTGG ATACACTGCT CAGCCCCCGG 780

CCCAGCCGCC GGCGCA':^.TCC GGGTCGCAAG .AATCGCACCA GGGCCCATCC ACGCCACCTA 840

CCGGCTTTCC GAGCTTGAGC CCACCACCAC CGGTCAGTGC CGGGACGi^GG TCGCAGGCTG 900 

GTTCGGCTCC AGTC.AACTAT TCAAACCCCA GCGGGGGCGA GCAGTCGTCG TCCCCCGGGG 960

GGGCGCCGGT CTAACCGGGC GTTCCCGCGT CCGGTCGCGC GTGTGCGCGA AGAGTGAACA 1020

GGGTGTCAGC AAGCGCGGAC GATCCTCGTG CCGAATTC 1058
(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

CGGCACGAGA GACCGATGCC GCTACCCTCG CGCAGGAGGC AGGTAATTTC GAGCGGATCT 60

CCGGCGACCT GAAAACCCAG ATCGACCAGG TGGAGTCGAC GGCAGGTTCG TTGCAGGGCC 120 

AGTGGCGCGG CGCGGCGGGG ACGGCCGCCC AGGCCGCGGT GGTGCGCTTC CAAGAAGCAG 180 

CCAATAAGCA GAAGCAGGAA CTCGACGAGA TCTCGACGAA TATTCGTCAG GCCGGCGTCC 240

AATACTCGAG GGCCGACGAG GAGCAGCAGC AGGCGCTGTC CTCGCAAATG GGCTTCTGAC 300

CCGCTAATAC G7VAAAGAAAC GGAGCAA 327
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
CGGTCGCGAT GATGGCGTTG TCGAACGTGA CCGATTCTGT ACCGCCGTCG TTGAGATCAA 60 
CCAACAACGT GTTGGCGTCG GCAAATGTGC CGNACCCGTG GATCTCGGTG ATCTTGTTCT 120 
TCTTCATCAG GAAGTGCACA CCGGCCACCC TGCCCTCGGN TACCTTTCGG 170

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

GATCCGGCGG CACGGGGGGT GCCGGCGGCA GCACCGCTGG CGCTGGCGGC AACGGCGGGG 60

CCGGGGGTGG CGGCGGAACC GGTGGGTTGC TCTTCGGCAA CGGCGGTGCC GGCGGGCACG 120

GGGCCG' 127
(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
CGGCGGC7VAG GGCGGCACCG CCGGCAACGG GAGCGGCGCG GCCGGCGGCA ACGGCGGCAA 60 
CGGCGGCTCC GGCCTCAACG G 81 
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 149 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

GATCAGGGCT GGCCGGCTCC GGCCAGAAGG GCGGTAACGG AGGAGCTGCC GGATTGTTTG 60 

GCAACGGCGG GGCCGGNGGT GCCGGCGCGT CC/U^CCAAGC CGGTAACGGC GGNGCCGGCG 120 

GAAACGGTGG TGCCGGTGGG CTGATCTGG 149

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

CGGCACGAGA TCACACCTAC CGAGTGATCG AGATCGTCGG GACCTCGCCC GACGGTGTCG 60 

ACGCGGNAAT CCAGGGCGGT CTGGCCCGAG CTGCGCAGAC CATGCGCGCG CTGGACTGGT 120 

TCGAAGTACA GTCAATTCGA GGCCACCTGG TCGACGGAGC GGTCGCGCAC TTCCAGGTGA 180

CTATGAAAGT CGGCTTCCGC CTGGAGGATT CCTGAACCTT CAAGCGCGGC CGAT7VACTGA 240

GGTGCATCAT TAAGCGACTT TTCCAGAACA TCCTGACGCG CTCGAAACGC GGTTCAGCCG 300

ACGGTGGCTC CGCCGAGGCG CTGCCTCCAA AATCCCTGCG ACA.ATTCGTC GGCGG 355
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

ATGCATCACC ATCACCATCA CATGCATCAG GTGGACCCCA ACTTGACACG TCGCAAGGGA 60 

CGATTGGCGG CACTGGCTAT CGCGGCGATG GCCAGCGCCA GCCTGGTGAC CGTTGCGGTG 120 

CCCGCGACCG CCAACGCCGA TCCGGAGCCA GCGCCCCCGG TACCCACAAC GGCCGCCTCG 180 

CCGCCGTCGA CCGCTGCAGC GCCACCCGCA CCGGCGACAC CTGTTGCCCC CCCACCACCG 240

GCCGCCGCCA ACACGCCGAA TGCCCAGCCG GGCGATCCCA ACGCAGCACC TCCGCCGGCC 300 

GACCCGAACG CACCGCCGCC ACCTGTCATT GCCCCAAACG CACCCCAACC TGTCCGGATC 360

GACAACCCGG TTGGAGGATT CAGCTTCGCG CTGCCTGCT':^ 'l^CTGGGTGGA GTCTGACGCC 420

GCCCACTTCG ACTACGGTTG AGCACTCCTC AGC7VA/iAC':A CCGGGGACCC GCCATTTCCC 480

GGAlAGCCGC CGCC'^'VrGGC !":AATGACAr-C CGTATCGT^"-;C TCGCCCGGrT AGACCAAAAG 540

CTTTACGCCA GCGCCGAAGC i:ACCGACTCC AAGGCCGCGG i^CCGGTTGGG CTCGGACATG 600 

GGTGAGTTCT ATATi:,CCCTA '^CCGGGCACC CGGATCAACC AGGTVAACCGT CTCGCTCGAC 660

GCCAACGGGG TGTCTGGAAG CGCGTCGTAT TACGAAGTCA AGTTCAGCGA TCCGAGTAAG 720

CCGAACGGCC AGATCTGGAC GGGCGTAATC GGCTCGCCCG CGGCGAACGC ACCGGACGCC 780 

GGGCCCCCTC AGCGCTGGTT TGTGGTATGG CTCGGGACCG CCAACAACCC GGTGGACAAG 840

GGCGCGGCCA AGGCGCTGGC CGAATCGATC CGGCGTTTGG TCGCCCCGCC GCCGGCGCCG 900 

GCACCGGCTC CTGCAGAGCC CGCTCCGGCG CCGGCGCCGG CCGGGGAAGT CGCTCCTACC 960 

CCGACGACAC CGACACCGCA GCGGACCTTA CCGGCCTGA 999 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 332 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Met His His His His His His Met His Gln Val Asp Pro Asn Leu Thr
1 5 10 15

Arg Arg Lys Gly Arg Leu Ala Ala Leu Ala Ile Ala Ala Met Ala Ser
20 25 30 

Ala Ser Leu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ala Asp Pro 
35 40 45 

Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr 
50 55 60 

Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro 
65 70 75 80 

Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala
85 90 95

Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro
100 105 110

Asn Ala Pro Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser
115 120 125 

Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp 
130 135 140 



Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro
145 150 155 160

Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg
165 170 175

Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala
180 185 190

Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro
195 200 205

Gly Thr Arg Ile Asn Gln Glu Thr Val Ser Leu Asp Ala Asn Gly Val
210 215 220 

Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys 
225 230 235 240 

Pro Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn
245 250 255

Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu Gly
260 265 270

Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu
275 280 285

Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro
290 295 300

Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr
305 310 315 320

Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala
325 330 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Xaa Asn Tyr Gly Gln Val
1 5 10 15 

Val Ala Ala Leu
20

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: 

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys
1 5 10 15

Glu Gly Arg



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Ala Ala Ala Ala Pro Pro
1 5 10 15

Ala



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Gln Thr Ser
1 5 10 15

Leu Leu Asn Asn Leu Ala Asp Pro Asp Val Ser Phe Ala Asp
20 25 30

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Gly Cys Gly Asp Arg Ser Gly Gly Asn Leu Asp Gln Ile Arg Leu Arg
1 5 10 15

Arg Asp Arg Ser Gly Gly Asn Leu
20

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 187 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Thr Gly Ser Leu Asn Gln Thr His Asn Arg Arg Ala Asn Glu Arg Lys
1 5 10 15

Asn Thr Thr Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala
20 25 30

Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala
35 40 45

Gly Gly Pro Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro
50 55 60

Leu Pro Leu Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln
65 70 75 80

Leu Thr Ser Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala
85 90 95

Asn Lys Gly Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg
100 105 110

Ile Ala Asp His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro
115 120 125

Leu Ser Phe Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala
130 135 140

Thr Ala Asp Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr
145 150 155 160

Gln Asn Val Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala
165 170 175

Ser Ala Met Glu Leu Leu Gln Ala Ala Gly Xaa
180 185

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 148 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Asp Glu Val Thr Val Glu Thr Thr Ser Val Phe Arg Ala Asp Phe Leu
1 5 10 15

Ser Glu Leu Asp Ala Pro Ala Gln Ala Gly Thr Glu Ser Ala Val Ser
20 25 30

Gly Val Glu Gly Leu Pro Pro Gly Ser Ala Leu Leu Val Val Lys Arg
35 40 45

Gly Pro Asn Ala Gly Ser Arg Phe Leu Leu Asp Gln Ala Ile Thr Ser
50 55 60

Ala Gly Arg His Pro Asp Ser Asp Ile Phe Leu Asp Asp Val Thr Val
65 70 75 80

Ser Arg Arg His Ala Glu Phe Arg Leu Glu Asn Asn Glu Phe Asn Val
85 90 95

Val Asp Val Gly Ser Leu Asn Gly Thr Tyr Val Asn Arg Glu Pro Val
100 105 110

Asp Ser Ala Val Leu Ala Asn Gly Asp Glu Val Gln Ile Gly Lys Leu
115 120 125

Arg Leu Val Phe Leu Thr Gly Pro Lys Gln Gly Glu Asp Asp Gly Ser
130 135 140 

Thr Gly Gly Pro 
145 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 230 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Thr Ser Asn Arg Pro Ala Arg Arg Gly Arg Arg Ala Pro Arg Asp Thr
1 5 10 15

Gly Pro Asp Arg Ser Ala Ser Leu Ser Leu Val Arg His Arg Arg Gln
20 25 30

Gln Arg Asp Ala Leu Cys Leu Ser Ser Thr Gln Ile Ser Arg Gln Ser
35 40 45

Asn Leu Pro Pro Ala Ala Gly Gly Ala Ala Asn Tyr Ser Arg Arg Asn
50 55 60

Phe Asp Val Arg Ile Lys Ile Phe Met Leu Val Thr Ala Val Val Leu
65 70 75 80

Leu Cys Cys Ser Gly Val Ala Thr Ala Ala Pro Lys Thr Tyr Cys Glu
85 90 95

Glu Leu Lys Gly Thr Asp Thr Gly Gln Ala Cys Gln Ile Gln Met Ser
100 105 110

Asp Pro Ala Tyr Asn Ile Asn Ile Ser Leu Pro Ser Tyr Tyr Pro Asp
115 120 125

Gln Lys Ser Leu Glu Asn Tyr Ile Ala Gln Thr Arg Asp Lys Phe Leu
130 135 140

Ser Ala Ala Thr Ser Ser Thr Pro Arg Glu Ala Pro Tyr Glu Leu Asn
145 150 155 160

Ile Thr Ser Ala Thr Tyr Gln Ser Ala Ile Pro Pro Arg Gly Thr Gln
165 170 175

Ala Val Val Leu Xaa Val Tyr His Asn Ala Gly Gly Thr His Pro Thr
180 185 190

Thr Thr Tyr Lys Ala Phe Asp Trp Asp Gln Ala Tyr Arg Lys Pro Ile
195 200 205

Thr Tyr Asp Thr Leu Trp Gln Ala Asp Thr Asp Pro Leu Pro Val Val
210 215 220

Phe Pro Ile Val Ala Arg
225 230 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 132 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe
1 5 10 15

Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser
20 25 30

Gly Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly
35 40 45

Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val
50 55 60

Val Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val
65 70 75 80

Ile Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala
85 90 95

Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp
100 105 110

Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu
115 120 125

Gly Pro Pro Ala 
130 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:

Val Pro Leu Arg Ser Pro Ser Met Ser Pro Ser Lys Cys Leu Ala Ala
1 5 10 15

Ala Gln Arg Asn Pro Val Ile Arg Arg Arg Arg Leu Ser Asn Pro Pro
20 25 30

Pro Arg Lys Tyr Arg Ser Met Pro Ser Pro Ala Thr Ala Ser Ala Gly
35 40 45

Met Ala Arg Val Arg Arg Arg Ala Ile Trp Arg Gly Pro Ala Thr Xaa
50 55 60

Ser Ala Gly Met Ala Arg Val Arg Arg Trp Xaa Val Met Pro Xaa Val
65 70 75 80

Ile Gln Ser Thr Xaa Ile Arg Xaa Xaa Gly Pro Phe Asp Asn Arg Gly
85 90 95

Ser Glu Arg Lys
100

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Met Thr Asp Asp Ile Leu Leu Ile Asp Thr Asp Glu Arg Val Arg Thr
1 5 10 15

Leu Thr Leu Asn Arg Pro Gln Ser Arg Asn Ala Leu Ser Ala Ala Leu
20 25 30

Arg Asp Arg Phe Phe Ala Xaa Leu Xaa Asp Ala Glu Xaa Asp Asp Asp
35 40 45

Ile Asp Val Val Ile Leu Thr Gly Ala Asp Pro Val Phe Cys Ala Gly
50 55 60

Leu Asp Leu Lys Val Ala Gly Arg Ala Asp Arg Ala Ala Gly His Leu
65 70 75 80

Thr Ala Val Gly Gly His Asp Gln Ala Gly Asp Arg Arg Asp Gln Arg
85 90 95

Arg Arg Gly His Arg Arg Ala Arg Thr Gly Ala Val Leu Arg His Pro
100 105 110

Asp Arg Leu Arg Ala Arg Pro Leu Arg Arg His Pro Arg Pro Gly Gly
115 120 125

Ala Ala Ala His Leu Gly Thr Gln Cys Val Leu Ala Ala Lys Gly Arg
130 135 140

His Arg Xaa Gly Pro Val Asp Glu Pro Asp Arg Arg Leu Pro Val Arg
145 150 155 160

Asp Arg Arg

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 344 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:

Met Lys Phe Val Asn His Ile Glu Pro Val Ala Pro Arg Arg Ala Gly
1 5 10 15

Gly Ala Val Xaa Glu Val Tyr Ala Glu Ala Arg Arg Glu Phe Gly Arg
20 25 30 

Leu Pro Glu Pro Leu Ala Met Leu Ser Pro Asp Glu Gly Leu Leu Thr 
35 40 45 

Ala Gly Trp Ala Thr Leu Arg Glu Thr Leu Leu Val Gly Gln Val Pro
50 55 60 

Arg Gly Arg Lys Glu Ala Val Ala Ala Ala Val Ala Ala Ser Leu Arg 
65 70 75 80 

Cys Pro Trp Cys Val Asp Ala His Thr Thr Met Leu Tyr Ala Ala Gly
85 90 95

Gln Thr Asp Thr Ala Ala Ala Ile Leu Ala Gly Thr Ala Pro Ala Ala
100 105 110

Gly Asp Pro Asn Ala Pro Tyr Val Ala Trp Ala Ala Gly Thr Gly Thr
115 120 125

Pro Ala Gly Pro Pro Ala Pro Phe Gly Pro Asp Val Ala Ala Glu Tyr
130 135 140 



Leu Gly Thr Ala Val Gln Phe His Phe Ile Ala Arg Leu Val Leu Val
145 150 155 160

Leu Leu Asp Glu Thr Phe Leu Pro Gly Gly Pro Arg Ala Gln Gln Leu
165 170 175

Met Arg Arg Ala Gly Gly Leu Val Phe Ala Arg Lys Val Arg Ala Glu
180 185 190

His Arg Pro Gly Arg Ser Thr Arg Arg Leu Glu Pro Arg Thr Leu Pro
195 200 205

Asp Asp Leu a Trp Ala Thr Pro Gor Glu Pro Ile Ala Thr Ala Phe
210 215 220

Ala Ala Leu Ser His His Leu Asp Thr Ala Pro His Leu Pro Pro Pro 
225 230 235 240 

Thr Arg Gln Val Val Arg Arg Val Val Gly Ser Trp His Gly Glu Pro
245 250 255

Met Pro Met Ser Ser Arg Trp Thr Asn Glu His Thr Ala Glu Leu Pro
260 265 270

Ala Asp Leu His Ala Pro Thr Arg Leu Ala Leu Leu Thr Gly Leu Ala
275 280 285

Pro His Gln Val Thr Asp Asp Asp Val Ala Ala Ala Arg Ser Leu Leu
290 295 300

Asp Thr Asp Ala Ala Leu Val Gly Ala Leu Ala Trp Ala Ala Phe Thr
305 310 315 320

Ala Ala Arg Arg Ile Gly Thr Trp Ile Gly Ala Ala Ala Glu Gly Gln
325 330 335

Val Ser Arg Gln Asn Pro Thr Gly
340 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:

Asp Asp Pro Asp Met Pro Gly Thr Val Ala Lys Ala Val Ala Asp Ala
1 5 10 15

Leu Gly Arg Gly Ile Ala Pro Val Glu Asp Ile Gln Asp Cys Val Glu
20 25 30

Ala Arg Leu Gly Glu Ala Gly Leu Asp Asp Val Ala Arg Val Tyr Ile
35 40 45

Ile Tyr Arg Gln Arg Arg Ala Glu Leu Arg Thr Ala Lys Ala Leu Leu
50 55 60

Gly Val Arg Asp Glu Leu Lys Leu Ser Leu Ala Ala Val Thr Val Leu
65 70 75 80

Arg Glu Arg Tyr Leu Leu His Asp Glu Gln Gly Arg Pro Ala Glu Ser
85 90 95

Thr Gly Glu Leu Met Asp Arg Ser Ala Arg Cys Val Ala Ala Ala Glu 
100 105 110 

Asp Gin Tyr Glu Pro Gly Ser Ser Arg Arg Trp Ala Glu Arg Phe Ala 
115 120 125 

Thr Leu Leu Arg Asn Leu Glu Phe Leu Pro Asn Ser Pro Thr Leu Met 
130 135 140 

Asn Ser Gly Thr Asp Leu Gly Leu Leu Ala Gly Cys Phe Val Leu Pro 
145 150 155 160 

Ile Glu Asp Ser Leu Gln Ser Ile Phe Ala Thr Leu Gly Gln Ala Ala 
165 170 175 

Glu Leu Gln Arg Ala Gly Gly Gly Thr Gly Tyr Ala Phe Ser His Leu 
180 185 190 

Arg Pro Ala Gly Asp Arg Val Ala Ser Thr Gly Gly Thr Ala Ser Gly 
195 200 205 

Pro Val Ser Phe Leu Arg Leu Tyr Asp Ser Ala Ala Gly Val Val Ser 
210 215 220 

Met Gly Gly Arg Arg Arg Gly Ala Cys Met Ala Val Leu Asp Val Ser 
225 230 235 240 

His Pro Asp Ile Cys Asp Phe Val Thr Ala Lys Ala Glu Ser Pro Ser 
245 250 255 

Glu Leu Pro His Phe Asn Leu Ser Val Gly Val Thr Asp Ala Phe Leu 
260 265 270 

Arg Ala Val Glu Arg Asn Gly Leu His Arg Leu Val Asn Pro Arg Thr 
275 280 285 

Gly Lys Ile Val Ala Arg Met Pro Ala Ala Glu Leu Phe Asp Ala Ile 
290 295 300 

Cys Lys Ala Ala His Ala Gly Gly Asp Pro Gly Leu Val Phe Leu Asp 
305 310 315 320 

Thr Ile Asn Arg Ala Asn Pro Val Pro Gly Arg Gly Arg Ile Glu Ala 
325 330 335 

Thr Asn Pro Cys Gly Glu Val Pro Leu Leu Pro Tyr Glu Ser Cys Asn 
340 345 350 

Leu Gly Ser Ile Asn Leu Ala Arg Met Leu Ala Asp Gly Arg Val Asp 
355 360 365 

Trp Asp Arg Leu Glu Glu Val Ala Gly Val Ala Val Arg Phe Leu Asp 
370 375 380 

Asp Val Ile Asp Val Ser Arg Tyr Pro Phe Pro Glu Leu Gly Glu Ala 
385 390 395 400 

Ala Arg Ala Thr Arg Lys Ile Gly Leu Gly Val Met Gly Leu Ala Glu 
405 410 415 

Leu Leu Ala Ala Leu Gly Ile Pro Tyr Asp Ser Glu Glu Ala Val Arg 
420 425 430 

Leu Ala Thr Arg Leu Met Arg Arg Ile Gln Gln Ala Ala His Thr Ala 
435 440 445 

Ser Arg Arg Leu Ala Glu Glu Arg Gly Ala Phe Pro Ala Phe Thr Asp 
450 455 460 

Ser Arg Phe Ala Arg Ser Gly Pro Arg Arg Asn Ala Gln Val Thr Ser 
465 470 475 480 



Val Ala Pro Thr Gly 
485 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72: 

Gly Val Ile Val Leu Asp Leu Glu Pro Arg Gly Pro Leu Pro Thr Glu 
1 5 10 15 

Ile Tyr Trp Arg Arg Arg Gly Leu Ala Leu Gly Ile Ala Val Val Val 
20 25 30 

Val Gly Ile Ala Val Ala Ile Val Ile Ala Phe Val Asp Ser Ser Ala 
35 40 45 

Gly Ala Lys Pro Val Ser Ala Asp Lys Pro Ala Ser Ala Gln Ser His 
50 55 60 

Pro Gly Ser Pro Ala Pro Gln Ala Pro Gln Pro Ala Gly Gln Thr Glu 
65 70 75 80 

Gly Asn Ala Ala Ala Ala Pro Pro Gln Gly Gln Asn Pro Glu Thr Pro 
85 90 95 

Thr Pro Thr Ala Ala Val Gln Pro Pro Pro Val Leu Lys Glu Gly Asp 
100 105 110 

Asp Cys Pro Asp Ser Thr Leu Ala Val Lys Gly Leu Thr Asn Ala Pro 
115 120 125 

Gln Tyr Tyr Val Gly Asp Gln Pro Lys Phe Thr Met Val Val Thr Asn 
130 135 140 

Ile Gly Leu Val Ser Cys Lys Arg Asp Val Gly Ala Ala Val Leu Ala 
145 150 155 160 

Ala Tyr Val Tyr Ser Leu Asp Asn Lys Arg Leu Trp Ser Asn Leu Asp 
165 170 175 

Cys Ala Pro Ser Asn Glu Thr Leu Val Lys Thr Phe Ser Pro Gly Glu 
180 185 190 

Gln Val Thr Thr Ala Val Thr Trp Thr Gly Met Gly Ser Ala Pro Arg 
195 200 205 

Cys Pro Leu Pro Arg Pro Ala Ile Gly Pro Gly Thr Tyr Asn Leu Val 
210 215 220 

Val Gln Leu Gly Asn Leu Arg Ser Leu Pro Val Pro Phe Ile Leu Asn 
225 230 235 240 

Gln Pro Pro Pro Pro Pro Gly Pro Val Pro Ala Pro Gly Pro Ala Gln 
245 250 255 

Ala Pro Pro Pro Glu Ser Pro Ala Gln Gly Gly 
260 265 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 97 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly Val Gln Val 
1 5 10 15 

Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu Val Val Ala 
20 25 30 

Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val Val Val Thr 
35 40 45 

Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu Val Ala Ala 
50 55 60 

Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr Phe Gln Asp 
65 70 75 80 

Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly Lys Ala Glu 
85 90 95 

Gln 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 364 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74: 

Gly Ala Ala Val Ser Leu Leu Ala Ala Gly Thr Leu Val Leu Thr Ala 
1 5 10 15 

Cys Gly Gly Gly Thr Asn Ser Ser Ser Ser Gly Ala Gly Gly Thr Ser 
20 25 30 

Gly Ser Val His Cys Gly Gly Lys Lys Glu Leu His Ser Ser Gly Ser 
35 40 45 

Thr Ala Gln Glu Asn Ala Met Glu Gln Phe Val Tyr Ala Tyr Val Arg 
50 55 60 

Ser Cys Pro Gly Tyr Thr Leu Asp Tyr Asn Ala Asn Gly Ser Gly Ala 
65 70 75 80 

Gly Val Thr Gln Phe Leu Asn Asn Glu Thr Asp Phe Ala Gly Ser Asp 
85 90 95 

Val Pro Leu Asn Pro Ser Thr Gly Gln Pro Asp Arg Ser Ala Glu Arg 
100 105 110 

Cys Gly Ser Pro Ala Trp Asp Leu Pro Thr Val Phe Gly Pro Ile Ala 
115 120 125 

Ile Thr Tyr Asn Ile Lys Gly Val Ser Thr Leu Asn Leu Asp Gly Pro 
130 135 140 

Thr Thr Ala Lys Ile Phe Asn Gly Thr Ile Thr Val Trp Asn Asp Pro 
145 150 155 160 

Gln Ile Gln Ala Leu Asn Ser Gly Thr Asp Leu Pro Pro Thr Pro Ile 
165 170 175 

Ser Val Ile Phe Arg Ser Asp Lys Ser Gly Thr Ser Asp Asn Phe Gln 
180 185 190 

Lys Tyr Leu Asp Gly Val Ser Asn Gly Ala Trp Gly Lys Gly Ala Ser 
195 200 205 

Glu Thr Phe Ser Gly Gly Val Gly Val Gly Ala Ser Gly Asn Asn Gly 
210 215 220 

Thr Ser Ala Leu Leu Gln Thr Thr Asp Gly Ser Ile Thr Tyr Asn Glu 
225 230 235 240 

Trp Ser Phe Ala Val Gly Lys Gln Leu Asn Met Ala Gln Ile Ile Thr 
245 250 255 

Ser Ala Gly Pro Asp Pro Val Ala Ile Thr Thr Glu Ser Val Gly Lys 
260 265 270 

Thr Ile Ala Gly Ala Lys Ile Met Gly Gln Gly Asn Asp Leu Val Leu 
275 280 285 

Asp Thr Ser Ser Phe Tyr Arg Pro Thr Gln Pro Gly Ser Tyr Pro Ile 
290 295 300 

Val Leu Ala Thr Tyr Glu Ile Val Cys Ser Lys Tyr Pro Asp Ala Thr 
305 310 315 320 

Thr Gly Thr Ala Val Arg Ala Phe Met Gln Ala Ala Ile Gly Pro Gly 
325 330 335 

Gln Glu Gly Leu Asp Gln Tyr Gly Ser Ile Pro Leu Pro Lys Ser Phe 
340 345 350 

Gln Ala Lys Leu Ala Ala Ala Val Asn Ala Ile Ser 

355 360 

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 309 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75: 

Gln Ala Ala Ala Gly Arg Ala Val Arg Arg Thr Gly His Ala Glu Asp 
1 5 10 15 

Gln Thr His Gln Asp Arg Leu His His Gly Cys Arg Arg Ala Ala Val 
20 25 30 

Val Val Arg Gln Asp Arg Ala Ser Val Ser Ala Thr Ser Ala Arg Pro 
35 40 45 

Pro Arg Arg His Pro Ala Gln Gly His Arg Arg Arg Val Ala Pro Ser 
50 55 60 

Gly Gly Arg Arg Arg Pro His Pro His His Val Gln Pro Asp Asp Arg 
65 70 75 80 

Arg Asp Arg Pro Ala Leu Leu Asp Arg Thr Gln Pro Ala Glu His Pro 

85 90 95 

Asp Pro His Arg Arg Gly Pro Ala Asp Pro Gly Arg Val Arg Gly Arg 
100 105 110 

Gly Arg Leu Arg Arg Val Asp Asp Gly Arg Leu Gln Pro Asp Arg Asp 
115 120 125 

Ala Asp His Gly Ala Pro Val Arg Gly Arg Gly Pro His Arg Gly Val 
130 135 140 

Gln His Arg Gly Gly Pro Val Phe Val Arg Arg Val Pro Gly Val Arg 
145 150 155 160 

Cys Ala His Arg Arg Gly His Arg Arg Val Ala Ala Pro Gly Gln Gly 
165 170 175 

Asp Val Leu Arg Ala Gly Leu Arg Val Glu Arg Leu Arg Pro Val Ala 
180 185 190 

Ala Val Glu Asn Leu His Arg Gly Ser Gln Arg Ala Asp Gly Arg Val 
195 200 205 

Phe Arg Pro Ile Arg Arg Gly Ala Arg Leu Pro Ala Arg Arg Ser Arg 
210 215 220 

Ala Gly Pro Gln Gly Arg Leu His Leu Asp Gly Ala Gly Pro Ser Pro 
225 230 235 240 

Leu Pro Ala Arg Ala Gly Gln Gln Gln Pro Ser Ser Ala Gly Gly Arg 
245 250 255 

Arg Ala Gly Gly Ala Glu Arg Ala Asp Pro Gly Gln Arg Gly Arg His 
260 265 270 

His Gln Gly Gly His Asp Pro Gly Arg Gln Gly Ala Gln Arg Gly Thr 
275 280 285 

Ala Gly Val Ala His Ala Ala Ala Gly Pro Arg Arg Ala Ala Val Arg 
290 295 300 

Asn Arg Pro Arg Arg 
305 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 580 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76: 

Ser Ala Val Trp Cys Leu Asn Gly Phe Thr Gly Arg His Arg His Gly 
1 5 10 15 

Arg Cys Arg Val Arg Ala Ser Gly Trp Arg Ser Ser Asn Arg Trp Cys 
20 25 30 

Ser Thr Thr Ala Asp Cys Cys Ala Ser Lys Thr Pro Thr Gln Ala Ala 
35 40 45 

Ser Pro Leu Glu Arg Arg Phe Thr Cys Cys Ser Pro Ala Val Gly Cys 
50 55 60 

Arg Phe Arg Ser Phe Pro Val Arg Arg Leu Ala Leu Gly Ala Arg Thr 
65 70 75 80 

Ser Arg Thr Leu Gly Val Arg Arg Thr Leu Ser Gln Trp Asn Leu Ser 
85 90 95 

Pro Arg Ala Gln Pro Ser Cys Ala Val Thr Val Glu Ser His Thr His 
100 105 110 

Ala Ser Pro Arg Met Ala Lys Leu Ala Arg Val Val Gly Leu Val Gln 
115 120 125 

Glu Glu Gln Pro Ser Asp Met Thr Asn His Pro Arg Tyr Ser Pro Pro 
130 135 140 

Pro Gln Gln Pro Gly Thr Pro Gly Tyr Ala Gln Gly Gln Gln Gln Thr 
145 150 155 160 

Tyr Ser Gln Gln Phe Asp Trp Arg Tyr Pro Pro Ser Pro Pro Pro Gln 
165 170 175 

Pro Thr Gln Tyr Arg Gln Pro Tyr Glu Ala Leu Gly Gly Thr Arg Pro 
180 185 190 

Gly Leu Ile Pro Gly Val Ile Pro Thr Met Thr Pro Pro Pro Gly Met 
195 200 205 

Val Arg Gln Arg Pro Arg Ala Gly Met Leu Ala Ile Gly Ala Val Thr 
210 215 220 

Ile Ala Val Val Ser Ala Gly Ile Gly Gly Ala Ala Ala Ser Leu Val 
225 230 235 240 

Gly Phe Asn Arg Ala Pro Ala Gly Pro Ser Gly Gly Pro Val Ala Ala 
245 250 255 

Ser Ala Ala Pro Ser Ile Pro Ala Ala Asn Met Pro Pro Gly Ser Val 
260 265 270 

Glu Gln Val Ala Ala Lys Val Val Pro Ser Val Val Met Leu Glu Thr 
275 280 285 

Asp Leu Gly Arg Gln Ser Glu Glu Gly Ser Gly Ile Ile Leu Ser Ala 
290 295 300 

Glu Gly Leu Ile Leu Thr Asn Asn His Val Ile Ala Ala Ala Ala Lys 
305 310 315 320 

Pro Pro Leu Gly Ser Pro Pro Pro Lys Thr Thr Val Thr Phe Ser Asp 
325 330 335 

Gly Arg Thr Ala Pro Phe Thr Val Val Gly Ala Asp Pro Thr Ser Asp 
340 345 350 

Ile Ala Val Val Arg Val Gln Gly Val Ser Gly Leu Thr Pro Ile Ser 
355 360 365 

Leu Gly Ser Ser Ser Asp Leu Arg Val Gly Gln Pro Val Leu Ala Ile 
370 375 380 

Gly Ser Pro Leu Gly Leu Glu Gly Thr Val Thr Thr Gly Ile Val Ser 
385 390 395 400 

Ala Leu Asn Arg Pro Val Ser Thr Thr Gly Glu Ala Gly Asn Gln Asn 
405 410 415 

Thr Val Leu Asp Ala Ile Gln Thr Asp Ala Ala Ile Asn Pro Gly Asn 
420 425 430 

Ser Gly Gly Ala Leu Val Asn Met Asn Ala Gln Leu Val Gly Val Asn 
435 440 445 

Ser Ala Ile Ala Thr Leu Gly Ala Asp Ser Ala Asp Ala Gln Ser Gly 
450 455 460 

Ser Ile Gly Leu Gly Phe Ala Ile Pro Val Asp Gln Ala Lys Arg Ile 
465 470 475 480 

Ala Asp Glu Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly 
485 490 495 

Val Gln Val Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu 
500 505 510 

Val Val Ala Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val 
515 520 525 

Val Val Thr Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu 
530 535 540 

Val Ala Ala Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr 
545 550 555 560 

Phe Gln Asp Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly 
565 570 575 

Lys Ala Glu Gln 
580 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77: 

Met Asn Asp Gly Lys Arg Ala Val Thr Ser Ala Val Leu Val Val Leu 

1 5 10 15 

Gly Ala Cys Leu Ala Leu Trp Leu Ser Gly Cys Ser Ser Pro Lys Pro 
20 25 30 

Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr Ala Ser Asp Pro 
35 40 45 

Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala Thr Lys Gly Leu 
50 55 60 

Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys Val Asp Ser Leu 
65 70 75 80 

Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala Asn Pro Leu Ala 
85 90 95 

Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly Val Pro Phe Arg 
100 105 110 

Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp Asp Trp Ser Asn 
115 120 125 

Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val Leu Asp Pro Ala 
130 135 140 

Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn Leu Gln Ala Gln 
145 150 155 160 

Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys Ile Thr Gly Thr 
165 170 175 

Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly Ala Lys Ser Ala 
180 185 190 

Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser His His Leu Val 
195 200 205 

Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln Leu Thr Gln Ser 
210 215 220 

Lys Trp Asn Glu Pro Val Asn Val Asp 

225 230 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: 

Val Ile Asp Ile Ile Gly Thr Ser Pro Thr Ser Trp Glu Gln Ala Ala 
1 5 10 15 

Ala Glu Ala Val Gln Arg Ala Arg Asp Ser Val Asp Asp Ile Arg Val 
20 25 30 

Ala Arg Val Ile Glu Gln Asp Met Ala Val Asp Ser Ala Gly Lys Ile 
35 40 45 

Thr Tyr Arg Ile Lys Leu Glu Val Ser Phe Lys Met Arg Pro Ala Gln 
50 55 60 

Pro Arg 
65 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Val Pro Pro Ala Pro Pro Leu Pro Pro Leu Pro Pro Ser Pro Ile Ser 
1 5 10 15 

Cys Ala Ser Pro Pro Ser Pro Pro Leu Pro Pro Ala Pro Pro Val Ala 
20 25 30 

Pro Gly Pro Pro Met Pro Pro Leu Asp Pro Trp Pro Pro Ala Pro Pro 
35 40 45 

Leu Pro Tyr Ser Thr Pro Pro Gly Ala Pro Leu Pro Pro Ser Pro Pro 
50 55 60 

Ser Pro Pro Leu Pro 
65 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Met Ser Asn Ser Arg Arg Arg Ser Leu Arg Trp Ser Trp Leu Leu Ser 
1 5 10 15 

Val Leu Ala Ala Val Gly Leu Gly Leu Ala Thr Ala Pro Ala Gln Ala 
20 25 30 

Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu 
35 40 45 

Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Ala Pro Gln Val Val 
50 55 60 

Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly Thr 
65 70 75 80 

Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val 
85 90 95 

Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val Gly Ser Gly Gln 
100 105 110 

Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp Val Ala 
115 120 125 

Val Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile Gly 
130 135 140 

Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser Gly 
145 150 155 160 

Gly Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala Leu 
165 170 175 

Gly Gln Thr Val Gln Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr 
180 185 190 

Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln Pro Gly Asp Ser 
195 200 205 

Gly Gly Pro Val Val Asn Gly Leu Gly Gln Val Val Gly Met Asn Thr 
210 215 220 

Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe Ala 
225 230 235 240 

Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser Gly 
245 250 255 

Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly Leu 
260 265 270 

Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val Val 
275 280 285 

Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val Ile 
290 295 300 

Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala Asp 
305 310 315 320 

Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp Gln 
325 330 335 

Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu Gly 
340 345 350 



Pro Pro Ala 
355 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 205 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81: 

Ser Pro Lys Pro Asp Ala Glu Glu Gln Gly Val Pro Val Ser Pro Thr 
1 5 10 15 

Ala Ser Asp Pro Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala 
20 25 30 

Thr Lys Gly Leu Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys 
35 40 45 

Val Asp Ser Leu Leu Gly Ile Thr Ser Ala Asp Val Asp Val Arg Ala 
50 55 60 

Asn Pro Leu Ala Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gln Gly 
65 70 75 80 

Val Pro Phe Arg Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp 
85 90 95 

Asp Trp Ser Asn Leu Gly Ser Ile Ser Glu Leu Ser Thr Ser Arg Val 
100 105 110 

Leu Asp Pro Ala Ala Gly Val Thr Gln Leu Leu Ser Gly Val Thr Asn 
115 120 125 

Leu Gln Ala Gln Gly Thr Glu Val Ile Asp Gly Ile Ser Thr Thr Lys 
130 135 140 

Ile Thr Gly Thr Ile Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly 
145 150 155 160 

Ala Lys Ser Ala Arg Pro Ala Thr Val Trp Ile Ala Gln Asp Gly Ser 
165 170 175 

His His Leu Val Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln 
180 185 190 

Leu Thr Gln Ser Lys Trp Asn Glu Pro Val Asn Val Asp 
195 200 205 



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 286 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Gly Asp Ser Phe Trp Ala Ala Ala Asp Gln Met Ala Arg Gly Phe Val 
1 5 10 15 

Leu Gly Ala Thr Ala Gly Arg Thr Thr Leu Thr Gly Glu Gly Leu Gln 
20 25 30 

His Ala Asp Gly His Ser Leu Leu Leu Asp Ala Thr Asn Pro Ala Val 
35 40 45 

Val Ala Tyr Asp Pro Ala Phe Ala Tyr Glu Ile Gly Tyr Ile Xaa Glu 
50 55 60 

Ser Gly Leu Ala Arg Met Cys Gly Glu Asn Pro Glu Asn Ile Phe Phe 
65 70 75 80 

Tyr Ile Thr Val Tyr Asn Glu Pro Tyr Val Gln Pro Pro Glu Pro Glu 
85 90 95 

Asn Phe Asp Pro Glu Gly Val Leu Gly Gly Ile Tyr Arg Tyr His Ala 
100 105 110 

Ala Thr Glu Gln Arg Thr Asn Lys Xaa Gln Ile Leu Ala Ser Gly Val 
115 120 125 

Ala Met Pro Ala Ala Leu Arg Ala Ala Gln Met Leu Ala Ala Glu Trp 
130 135 140 

Asp Val Ala Ala Asp Val Trp Ser Val Thr Ser Trp Gly Glu Leu Asn 
145 150 155 160 

Arg Asp Gly Val Val Ile Glu Thr Glu Lys Leu Arg His Pro Asp Arg 
165 170 175 

Pro Ala Gly Val Pro Tyr Val Thr Arg Ala Leu Glu Asn Ala Arg Gly 
180 185 190 

Pro Val Ile Ala Val Ser Asp Trp Met Arg Ala Val Pro Glu Gln Ile 
195 200 205 

Arg Pro Trp Val Pro Gly Thr Tyr Leu Thr Leu Gly Thr Asp Gly Phe 
210 215 220 



Gly Phe Ser Asp Thr Arg Pro Ala Gly Arg Arg Tyr Phe Asn Thr Asp 
225 230 235 240 

Ala Glu Ser Gln Val Gly Arg Gly Phe Gly Arg Gly Trp Pro Gly Arg 
245 250 255 

Arg Val Asn Ile Asp Pro Phe Gly Ala Gly Arg Gly Pro Pro Ala Gln 
260 265 270 

Leu Pro Gly Phe Asp Glu Gly Gly Gly Leu Arg Pro Xaa Lys 
275 280 285 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83: 

Thr Lys Phe His Ala Leu Met Gln Glu Gln Ile His Asn Glu Phe Thr 
1 5 10 15 

Ala Ala Gln Gln Tyr Val Ala Ile Ala Val Tyr Phe Asp Ser Glu Asp 
20 25 30 

Leu Pro Gln Leu Ala Lys His Phe Tyr Ser Gln Ala Val Glu Glu Arg 
35 40 45 

Asn His Ala Met Met Leu Val Gln His Leu Leu Asp Arg Asp Leu Arg 
50 55 60 

Val Glu Ile Pro Gly Val Asp Thr Val Arg Asn Gln Phe Asp Arg Pro 
65 70 75 80 

Arg Glu Ala Leu Ala Leu Ala Leu Asp Gln Glu Arg Thr Val Thr Asp 
85 90 95 

Gln Val Gly Arg Leu Thr Ala Val Ala Arg Asp Glu Gly Asp Phe Leu 
100 105 110 

Gly Glu Gln Phe Met Gln Trp Phe Leu Gln Glu Gln Ile Glu Glu Val 
115 120 125 

Ala Leu Met Ala Thr Leu Val Arg Val Ala Asp Arg Ala Gly Ala Asn 
130 135 140 

Leu Phe Glu Leu Glu Asn Phe Val Ala Arg Glu Val Asp Val Ala Pro 
145 150 155 160 

Ala Ala Ser Gly Ala Pro His Ala Ala Gly Gly Arg Leu 
165 170 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84: 

Arg Ala Asp Glu Arg Lys Asn Thr Thr Met Lys Met Val Lys Ser Ile 
1 5 10 15 

Ala Ala Gly Leu Thr Ala Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly 
20 25 30 

Val Thr Ser Ile Met Ala Gly Gly Pro Val Val Tyr Gln Met Gln Pro 
35 40 45 

Val Val Phe Gly Ala Pro Leu Pro Leu Asp Pro Xaa Ser Ala Pro Xaa 
50 55 60 

Val Pro Thr Ala Ala Gln Trp Thr Xaa Leu Leu Asn Xaa Leu Xaa Asp 
65 70 75 80 

Pro Asn Val Ser Phe Xaa Asn Lys Gly Ser Leu Val Glu Gly Gly Ile 
85 90 95 

Gly Gly Xaa Glu Gly Xaa Xaa Arg Arg Xaa Gln 
100 105 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Val Leu Ser Val Pro Val Gly Asp Gly Phe Trp Xaa Arg Val Val Asn 
1 5 10 15 

Pro Leu Gly Gln Pro Ile Asp Gly Arg Gly Asp Val Asp Ser Asp Thr 
20 25 30 

Arg Arg Ala Leu Glu Leu Gln Ala Pro Ser Val Val Xaa Arg Gln Gly 
35 40 45 

Val Lys Glu Pro Leu Xaa Thr Gly Ile Lys Ala Ile Asp Ala Met Thr 
50 55 60 

Pro Ile Gly Arg Gly Gln Arg Gln Leu Ile Ile Gly Asp Arg Lys Thr 
65 70 75 80 

Gly Lys Asn Arg Arg Leu Cys Arg Thr Pro Ser Ser Asn Gln Arg Glu 
85 90 95 

Glu Leu Gly Val Arg Trp Ile Pro Arg Ser Arg Cys Ala Cys Val Tyr 
100 105 110 

Val Gly His Arg Ala Arg Arg Gly Thr Tyr His Arg Arg 
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 117 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86: 

Cys Asp Ala Val Met Gly Phe Leu Gly Gly Ala Gly Pro Leu Ala Val 
1 5 10 15 

Val Asp Gln Gln Leu Val Thr Arg Val Pro Gln Gly Trp Ser Phe Ala 
20 25 30 

Gln Ala Ala Ala Val Pro Val Val Phe Leu Thr Ala Trp Tyr Gly Leu 
35 40 45 

Ala Asp Leu Ala Glu Ile Lys Ala Gly Glu Ser Val Leu Ile His Ala 
50 55 60 

Gly Thr Gly Gly Val Gly Met Ala Ala Val Gln Leu Ala Arg Gln Trp 
65 70 75 80 

Gly Val Glu Val Phe Val Thr Ala Ser Arg Gly Lys Trp Asp Thr Leu 
85 90 95 

Arg Ala Xaa Xaa Phe Asp Asp Xaa Pro Tyr Arg Xaa Phe Pro His Xaa 
100 105 110 

Arg Ser Ser Xaa Gly 
115 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87: 

Met Tyr Arg Phe Ala Cys Arg Thr Leu Met Leu Ala Ala Cys Ile Leu 
1 5 10 15 

Ala Thr Gly Val Ala Gly Leu Gly Val Gly Ala Gln Ser Ala Ala Gln 
20 25 30 

Thr Ala Pro Val Pro Asp Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp 
35 40 45 

Pro Ala Trp Gly Pro Asn Trp Asp Pro Tyr Thr Cys His Asp Asp Phe 
50 55 60 

His Arg Asp Ser Asp Gly Pro Asp His Ser Arg Asp Tyr Pro Gly Pro 
65 70 75 80 

Ile Leu Glu Gly Pro Val Leu Asp Asp Pro Gly Ala Ala Pro Pro Pro 
85 90 95 

Pro Ala Ala Gly Gly Gly Ala 
100 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88: 

Val Gln Cys Arg Val Trp Leu Glu Ile Gln Trp Arg Gly Met Leu Gly 
1 5 10 15 

Ala Asp Gln Ala Arg Ala Gly Gly Pro Ala Arg Ile Trp Arg Glu His 

20 25 30 

Ser Met Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala 
35 40 45 

Thr Lys Glu Gly Arg Gly Ile Val Met Arg Val Pro Leu Glu Gly Gly 
50 55 60 

Gly Arg Leu Val Val Glu Leu Thr Pro Asp Glu Ala Ala Ala Leu Gly 

65 70 75 80 

Asp Glu Leu Lys Gly Val Thr Ser 

85 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89: 

Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile 
1 5 10 15 

Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly 
20 25 30 

Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala 
35 40 45 

Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu 
50 55 60 

Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg 
65 70 75 80 

Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe 
85 90 95 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90: 

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn 
1 5 10 15 

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val 
20 25 30 

Pro Ile Thr Pro Cys Glu Leu Thr Xaa Xaa Lys Asn Ala Ala Gln Gln 
35 40 45 

Xaa Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala 
50 55 60 

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Xaa 
65 70 75 80 

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly 
85 90 95 

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser 
100 105 110 

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro 
115 120 125 

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp 
130 135 140 

Gln Gly Ala Ser Leu Ala His Xaa Gly Asp Gly Trp Asn Thr Xaa Thr 
145 150 155 160 

Leu Thr Leu Gln Gly Asp 
165 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Arg Ala Glu Arg Met 
1 5 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 263 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Val Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr Ala Ala 
1 5 10 15 

Gln Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly Leu Thr 
20 25 30 

Val Pro Pro Pro Val Ile Ala Glu Asn Arg Ala Glu Leu Met Ile Leu 
35 40 45 

Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr Pro Ala Ile Ala Val Asn 
50 55 60 

Glu Ala Glu Tyr Gly Glu Met Trp Ala Gln Asp Ala Ala Ala Met Phe 
65 70 75 80 

Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro Phe 
85 90 95 

Glu Glu Ala Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln Ala 
100 105 110 

Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gln Leu Met 
115 120 125 

Asn Asn Val Pro Gln Ala Leu Lys Gln Leu Ala Gln Pro Thr Gln Gly 
130 135 140 

Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys Thr Val Ser Pro 
145 150 155 160 

His Arg Ser Pro Ile Ser Asn Met Val Ser Met Ala Asn Asn His Met 
165 170 175 

Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met 
180 185 190 

Leu Lys Gly Phe Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr Ala 
195 200 205 

Ala Gln Asn Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu Gly 
210 215 220 

Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala Ala 
225 230 235 240 

Ser Val Arg Tyr Gly His Arg Asp Gly Gly Lys Tyr Ala Xaa Ser Gly 
245 250 255 

Arg Arg Asn Gly Gly Pro Ala 

260 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Met Thr Tyr Ser Pro Gly Asn Pro Gly Tyr Pro Gln Ala Gln Pro Ala 
1 5 10 15 

Gly Ser Tyr Gly Gly Val Thr Pro Ser Phe Ala His Ala Asp Glu Gly 
20 25 30 

Ala Ser Lys Leu Pro Met Tyr Leu Asn Ile Ala Val Ala Val Leu Gly 
35 40 45 

Leu Ala Ala Tyr Phe Ala Ser Phe Gly Pro Met Phe Thr Leu Ser Thr 
50 55 60 

Glu Leu Gly Gly Gly Asp Gly Ala Val Ser Gly Asp Thr Gly Leu Pro 
65 70 75 80 

Val Gly Val Ala Leu Leu Ala Ala Leu Leu Ala Gly Val Val Leu Val 
85 90 95 

Pro Lys Ala Lys Ser His Val Thr Val Val Ala Val Leu Gly Val Leu 
100 105 110 

Gly Val Phe Leu Met Val Ser Ala Thr Phe Asn Lys Pro Ser Ala Tyr 
115 120 125 

Ser Thr Gly Trp Ala Leu Trp Val Val Leu Ala Phe Ile Val Phe Gln 
130 135 140 

Ala Val Ala Ala Val Leu Ala Leu Leu Val Glu Thr Gly Ala Ile Thr 
145 150 155 160 

Ala Pro Ala Pro Arg Pro Lys Phe Asp Pro Tyr Gly Gln Tyr Gly Arg 
165 170 175 

Tyr Gly Gln Tyr Gly Gln Tyr Gly Val Gln Pro Gly Gly Tyr Tyr Gly 
180 185 190 

Gln Gln Gly Ala Gln Gln Ala Ala Gly Leu Gln Ser Pro Gly Pro Gln 
195 200 205 

Gln Ser Pro Gln Pro Pro Gly Tyr Gly Ser Gln Tyr Gly Gly Tyr Ser 
210 215 220 

Ser Ser Pro Ser Gln Ser Gly Ser Gly Tyr Thr Ala Gln Pro Pro Ala 
225 230 235 240 

Gln Pro Pro Ala Gln Ser Gly Ser Gln Gln Ser His Gln Gly Pro Ser 
245 250 255 

Thr Pro Pro Thr Gly Phe Pro Ser Phe Ser Pro Pro Pro Pro Val Ser 
260 265 270 

Ala Gly Thr Gly Ser Gln Ala Gly Ser Ala Pro Val Asn Tyr Ser Asn 
275 280 285 

Pro Ser Gly Gly Glu Gln Ser Ser Ser Pro Gly Gly Ala Pro Val 
290 295 300 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 507 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

ATGAAGATGG TGAAATCGAT CGCCGCAGGT CTGACCGCCG CGGCTGCAAT CGGCGCCGCT 60 

GCGGCCGGTG TGACTTCGAT CATGGCTGGC GGCCCGGTCG TATACCAGAT GCAGCCGGTC 120 

GTCTTCGGCG CGCCACTGCC GTTGGACGCG GCATCCGCCC CTGACGTCCG GACCGCCGCC 180 

CAGTTGACCA GCCTGCTCAA CAGCCTCGCC GATCCCAACG TGTCGTTTGC GAACAAGGGC 240 

AGTCTGGTCG AGGGCGGCAT CGGGGGCACC GAGGCGCGCA TCGCCGACCA CAAGCTGAAG 300 

AAGGCCGCCG AGCACGGGGA TCT:;CCGCTG TCGTTCAGCG T-^.ACGAACAT CCAGCCGGCG 360 

GCCGCCGGTT CGGCCACGGC CGA:GTTTCC GTCTCGGGTC CGAAGCTCTC GTCGCCGGTC 420 

ACGCAGAACG TCACGTTCGT GAATCAAGGC GGCTGGATGC TGTCACGCGC ATCGGCGATG 480 

GAGTTGCTGC AGGCCGCAGG GAACTGA 507 



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 168 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala Ala Ala Ala 
1 5 10 15 

Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala Gly Gly Pro 
20 25 30 

Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro Leu Pro Leu 
35 40 45 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser 
50 55 60 

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn Lys Gly 
65 70 75 80 

Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg Ile Ala Asp 
85 90 95 

His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro Leu Ser Phe 
100 105 110 

Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala Thr Ala Asp 
115 120 125 

Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr Gln Asn Val 
130 135 140 

Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala Ser Ala Met 
145 150 155 160 

Glu Leu Leu Gln Ala Ala Gly Asn 
165 

(2) INFORMATION FOR SEQ ID NO: 96: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 
CGTGGCAATG TCGTTGACCG TCGGGGCCGG GGTCGCCTCC GCAGATCCCG TGGACGCGGT 60 

CATTAACACC ACCTGCAATT ACGGGCAGGT AGTAGCTGCG CTCAACGCGA CGGATCCGGG 120 

GGCTGCCGCA CAGTTCAACG CCTCACCGGT GGCGCAGTCC TATTTGCGCA ATTTCCTCGC 180 

CGCACCGCCA CCTCAGCGCG CTGCCATGGC CGCGCAATTG CAAGCTGTGC CGGGGGCGGC 240 

ACAGTACATC GGCCTTGTCG AGTCGGTTGC CGGCTCCTGC AACAACTATT AAGCCCATGC 300 

GGGCCCCATC CCGCGACCCG GCATCGTCGC CGGGGCTAGG CCAGATTGCC CCGCTCCTCA 360 

ACGGGCCGCA TCCCGCGACC CGGCATCGTC GCCGGGGCTA GGCCAGATTG CCCCGCTCCT 420 

CAACGGGCCG CATCTCGTGC CGAATTCCTG CAGCCCGGGG GATCCACTAG TTCTAGAGCG 480 

GCCGCCACCG CGGTGGAGCT 500 
(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

Val Ala Met Ser Leu Thr Val Gly Ala Gly Val Ala Ser Ala Asp Pro
1 5 10 15

Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val Val Ala
20 25 30

Ala Leu Asn Ala Thr Asp Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser
35 40 45

Pro Val Ala Gln Ser Tyr Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro
50 55 60

Gln Arg Ala Ala Met Ala Ala Gln Leu Gln Ala Val Pro Gly Ala Ala
65 70 75 80

Gln Tyr Ile Gly Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr
85 90 95

(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 154 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 
ATGACAGAGC AGCAGTGGAA TTTCGCGGGT ATCGAGGCCG CGGCAAGCGC AATCCAGGGA 60

AATGTCACGT CCATTCATTC CCTCCTTGAC GAGGGGAAGC AGTCCCTGAC CAAGCTCGCA 120

GCGGCCTGGG GCGGTAGCGG TTCGGAAGCG TACC 154
(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

Met Thr Glu Gln Gln Trp Asn Phe Ala Gly Ile Glu Ala Ala Ala Ser
1 5 10 15

Ala Ile Gln Gly Asn Val Thr Ser Ile His Ser Leu Leu Asp Glu Gly
20 25 30

Lys Gln Ser Leu Thr Lys Leu Ala Ala Ala Trp Gly Gly Ser Gly Ser
35 40 45

Glu Ala Tyr
50

(2) INFORMATION FOR SEQ ID NO: 100: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 282 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

CGGTCGCGCA CTTCCAGGTG ACTATGAAAG TCGGCTTCCG NCTGGAGGAT TCCTGAACCT 60 

TCAAGCGCGG CCGATAACTG AGGTGCATCA TTAAGCGACT TTTCCAGAAC ATCCTGACGC 120 

GCTCGAAACG CGGCACAGCC GACGGTGGCT CCGNCGAGGC GCTGNCTCCA AAATCCCTGA 180 

GACAATTCGN CGGGGGCGCC TACAAGGAAC TCGGTGCTGA ATTCGNCGNG TATCTGGTCG 240

ACCTGTGTGG TCTGNAGCCG GACGAAGCGG TGCTCGACGT CG 282
(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

GATCGTACCC GTGCGAGTGC TCGGGCCGTT TGAGGATGGA GTGCACGTGT CTTTCGTGAT 60 

GGCATACCCA GAGATGTTGG CGGCGGCGGC TGACACCCTG CAGAGCATCG GTGCTACCAC 120 

TGTGGCTAGC AATGCCGCTG CGGCGGCCCC GACGACTGGG GTGGTGCCCC CCGCTGCCGA 180 

TGAGGTGTCG GCGCTGACTG CGGCGCACTT CGCCGCACAT GCGGCGATGT ATCAGTCCGT 240

GAGCGCTCGG GCTGCTGCGA TTCATGACCA GTTCGTGGCC ACCCTTGCCA GCAGCGCCAG 300

CTCGTATGCG GCCACTGAAG TCGCCAATGC GGCGGCGGCC AGCTAAGCCA GGAACAGTCG 360

GCACGAGAAA CCACGAGAAA TAGGGACACG TAATGGTGGA TTTCGGGGCG TTACCACCGG 420

AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC CTCGCTGGTG GCCGCGGCTC 480

AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC GTCGGCGTTT CAGTCGGTGG 540

TCTGGGGTCT GACGGTGGGG TCGTGGATAG GTTCGTCGGC GGGTCTGATG GTGGCGGCGG 600

CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA GGCCGAGCTG ACCGCCGCCC 660

AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG GCTGACGGTG CCCCCGCCGG 720

TGATCGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC GACCAACCTC TTGGGGCAAA 780

ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGCGA GATGTGGGCC CAAGACGCCG 840

CCGCGATGTT TGGCTACGCC GCGGCGACGG CGACGGCGAC GGCGACGTTG CTGCCGTTCG 900 

AGGAGGCGCC GGAGATGACC AGCGCGGGTG GGCTCCTCGA GCAGGCCGCC GCGGTCGAGG 960 

AGGCCTCCGA CACCGCCGCG GCGAACCAGT TGATGAACAA TGTGCCCCAG GCGCTGCAAC 1020 

AGCTGGCCCA GCCCACGCAG GGCACCACGC CTTCTTCCAA GCTGGGTGGC CTGTGGAAGA 1080

CGGTCTCGCC GCATCGGTCG CCGATCAGCA ACATGGTGTC GATGGCCAAC AACCACATGT 1140

CGATGACCAA CTCGGGTGTG TCGATGACCA ACACCTTGAG CTCGATGTTG AAGGGCTTTG 1200

CTCCGGCGGC GGCCGCCCAG GCCGTGCAAA CCGCGGGGCA AAACGGGGTC CGGGCGATGA 1260

GCTCGCTGGG CAGCTCGCTG GGTTCTTCGG GTCTGGGCGG TGGGGTGGCC GCCAACTTGG 1320

GTCGGGCGGC CTCGGTCGGT TCGTTGTCGG TGCCGCAGGC CTGGGCCGCG GCCAACCAGG 1380

CAGTCACCCC GGCGGCGCGG GCGCTGCCGC TGACCAGCCT GACCAGCGCC GCGGAAAGAG 1440

GGCCCGGGCA GATGCTGGGC GGGCTGCCGG TGGGGCAGAT GGGCGCCAGG GCCGGTGGTG 1500

GGCTCAGTGG TGTGCTGCGT GTTCCGCCGC GACCCTATGT GATGCCGCAT TCTCCGGCGG 1560

CCGGCTAGGA GAGGGGGCGC AGACTGTCGT TATTTGACCA GTGATCGGCG GTCTCGGTGT 1620

TTCCGCGGCC GGCTATGACA ACAGTCAATG TGCATGACAA GTTACAGGTA TTAGGTCCAG 1680 

GTTCAACAAG GAGACAGGCA ACATGGCCTC ACGTTTTATG ACGGATCCGC ACGCGATGCG 1740

GGACATGGCG GGCCGTTTTG AGGTGCACGC CCAGACGGTG GAGGACGAGG CTCGCCGGAT 1800

GTGGGCGTCC GCGCAAAACA TTTCCGGTGC GGGCTGGAGT GGCATGGCCG AGGCGACCTC 1860

GCTAGACACC ATGGCCCAGA TGAATCAGGC GTTTCGCAAC ATCGTGAACA TGCTGCACGG 1920

GGTGCGTGAC GGGCTGGTTC GCGACGCCAA CAACTACGAG CAGCAAGAGC AGGCCTCCGA 1980

GCAGATCCTC AGCAGCTTWC GTCAGCCGCT GCAGCACAAT ACTTTTACAA GCGAAGGAGA 2040

ACAGGTTCCA TGACCATCAA CTATCAATTC GGGGATGTCG ACGdv:ACGG CGCCATGATC 2100

CGCGCTCAGG L^CGGGTTGCT GGAGGCCGAG CATCAGGCCA TCATTCGTGA TGTGTTGACC 2160

GCGAGTGACT TTTGGGGCGG CGCCGGTTCG GCGGCCTGCC AGGGGTTCAT TACCCAGTTG 2220

GGCCGTAACT TCCAGGTGAT CTACGAGCAG GCCAACGCCC ACGGGCAGAA GGTGCAGGCT 2280

GCCGGCAACA ACATGGCGCA AACCGACAGC GCCGTCGGCT CCAGCTGGGC CTGACACCAG 2340

GCCAAGGCCA GGGACGTGGT GTACGAGTGA AGTTCCTCGC GTGATCCTTC GGGTGGCAGT 2400

CTAAGTGGTC AGTGCTGGGG TGTTGGTGGT TTGCTGCTTG GCGGGTTCTT CGGTGCTGGT 2460

CAGTGCTGCT CGGGCTCGGG TGAGGACCTC GAGGCCCAGG TAGCGCCGTC CTTCGATCCA 2520

TTCGTCGTGT TGTTCGGCGA GGACGGCTCC GACGAGGCGG ATGATCGAGG CGCGGTCGGG 2580

GAAGATGCCC ACGACGTCGG TTCGGCGTCG TACCTCTCGG TTGAGGCGTT CCTGGGGGTT 2640

GTTGGACCAG ATTTGGCGCC AGATCTGCTT GGGGAAGGCG GTGAACGCCA GCAGGTCGGT 2700

GCGGGCGGTG TCGAGGTGCT CGGCCACCGC GGGGAGTTTG TCGGTCAGAG CGTCGAGTAC 2760

CCGATCATAT TGGGCAACAA CTGATTCGGC GTCGGGCTGG TCGTAGATGG AGTGCAGCAG 2820

GGTGCGCACC CACGGCCAGG AGGGCTTCGG GGTGGCTGCC ATCAGATTGG CTGCGTAGTG 2880

GGTTCTGCAG CGCTGCCAGG CCGCTGCGGG CAGGGTGGCG CCGATCGCGG CCACCAGGCC 2940

GGCGTGGGCG TCGCTGGTGA CCAGCGCGAC GCCGGACAGG CCGCGGGCGA CCAGGTCGCG 3000

GAAGAACGCC AGCCAGCCGG CCCCGTCCTC GGCGGAGGTG ACCTGGATGC CCAGGATC 3058
(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 391 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Met Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Gln Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr
65 70 75 80

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met
130 135 140

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Ala Thr Ala
145 150 155 160

Thr Ala Thr Ala Thr Leu Leu Pro Phe Glu Glu Ala Pro Glu Met Thr
165 170 175

Ser Ala Gly Gly Leu Leu Glu Gln Ala Ala Ala Val Glu Glu Ala Ser
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Thr Gln Gly Thr Thr Pro Ser Ser Lys Leu
210 215 220

Gly Gly Leu Trp Lys Thr Val Ser Pro His Arg Ser Pro Ile Ser Asn
225 230 235 240

Met Val Ser Met Ala Asn Asn His Met Ser Met Thr Asn Ser Gly Val
245 250 255

Ser Met Thr Asn Thr Leu Ser Ser Met Leu Lys Gly Phe Ala Pro Ala
260 265 270

Ala Ala Ala Gln Ala Val Gln Thr Ala Ala Gln Asn Gly Val Arg Ala
275 280 285

Met Ser Ser Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly Gly Gly
290 295 300

Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser Leu Ser Val
305 310 315 320

Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala Ala Arg
325 330 335

Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Glu Arg Gly Pro Gly
340 345 350

Gln Met Leu Gly Gly Leu Pro Val Gly Gln Met Gly Ala Arg Ala Gly
355 360 365

Gly Gly Leu Ser Gly Val Leu Arg Val Pro Pro Arg Pro Tyr Val Met
370 375 380

Pro His Ser Pro Ala Ala Gly
385 390

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1725 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

GACGTCAGCA CCCGCCGTGC AGGGCTGGAG CGTGGTCGGT TTTGATCTGC GGTCAAGGTG 60 

ACGTCCCTCG GCGTGTCGCC GGCGTGGATG CAGACTCGAT GCCGCTCTTT AGTGCAACTA 120 

ATTTCGTTGA AGTGCCTGCG AGGTATAGGA CTTCACGATT GGTTAATGTA GCGTTCACCC 180

CGTGTTGGGG TCGATTTGGC CGGACCAGTC GTCACCAACG CTTGGCGTGC GCGCCAGGCG 240

GGCGATCAGA TCGCTTGACT ACCAATCAAT CTTGAGCTCC CGGGCCGATG CTCGGGCTAA 300

ATGAGGAGGA GCACGCGTGT CTTTCACTGC GCAACCGGAG ATGTTGGCGG CCGCGGCTGG 360

CGAACTTCGT TCCCTGGGGG CAACGCTGAA GGCTAGCAAT GCCGCCGCAG CCGTGCCGAC 420

GACTGGGGTG GTGCCCCCGG CTGCCGACGA GGTGTCGCTG CTGCTTGCCA CACAATTCCG 480

TACGCATGCG GCGACGTATC AGACGGCCAG CGCCAAGGCC GCGGTGATCC ATGAGCAGTT 540

TCTGACCAC^:; CTGGCCACCA GCGCTAGTTC ATATGCGGAC ACCGAGGCCG CCAACGCTGT 600

GGTCACCGGC TAGCTGACCT GACGGTATTC GAGCGGAAGG ATTATCGAAG TGGTGGATTT 660

CGGGGCGTTA CCAGCGGAGA TCAACTCCGC GAGGATGTAC GCCGGCCCGG GTTCGGCCTC 720

GCTGGTGGCC GCCGCGAAGA TGTGGGACAG CGTGGCGAGT GACCTGTTTT CGGCCGCGTC 780

GGCGTTTCAG TCGGTGGTCT GGGGTCTGAC GGTGGGGTCG TGGATAGGTT CGTCGGCGGG 840

TCT ^ATGG'"^'"; GC'";~'.:GGCCT ''JGCCGTATGT f:.";L;CCTGGATG AGCGTCAiJJG C:^GGGCAGGC 900 

CCA'^CTGA''^'': Grc^rCCh^'^G ri':CGGGTTG': TGCGGCGGCC TAi;:l;AGACAG Cv^TATAGGCT 9()0 

GACGGTGCCC CCG :CGGTi"^,A TCGCCGAGAA GCGTACCGAA CTGATGACGC T'.^ACO:^CGAC 1020 

CAAGCTCTT'.; GGGGAAMCA Ci:XCGGCGAT CGAGGCC7\AT CAGGCCGCAT ACAGCCAGAT 108 0 
GTGGGGCCAA GACGCGGAGG CGATGTATGG CTACGCCGCC ACGGCGGCGA CGGCGACCGA 1140

GGCGTTGCTG CCGTTCGAGG ACGCCCCACT GATCACCAAC CCCGGCGGGC TCCTTGAGCA 1200

GGCCGTCGCG GTCGAGGAGG CCATCGACAC CGCCGCGGCG AACCAGTTGA TGAACAATGT 1260

GCCCCAAGCG CTGCAACAGC TGGCCCAGCC AGCGCAGGGC GTCGTACCTT CTTCCAAGCT 1320

GGGTGGGCTG TGGACGGCGG TCTCGCCGCA TCTGTCGCCG CTCAGCAACG TCAGTTCGAT 1380

AGCCAACAAC CACATGTCGA TGATGGGCAC GGGTGTGTCG ATGACCAACA CCTTGCACTC 1440

GATGTTGAAG GGCTTAGCTC CGGCGGCGGC TCAGGCCGTG GAAACCGCGG CGGAAAACGG 1500

GGTCTGGGCG ATGAGCTCGC TGGGCAGCCA GCTGGGTTCG TCGCTGGGTT CTTCGGGTCT 1560

GGGCGCTGGG GTGGCCGCCA ACTTGGGTCG GGCGGCCTCG GTCGGTTCGT TGTCGGTGCC 1620

GCCAGCATGG GCCGCGGCCA ACCAGGCGGT CACCCCGGCG GCGCGGGCGC TGCCGCTGAC 1680

CAGCCTGACC AGCGCCGCCC AAACCGCCCC CGGACACATG CTGGG 1725
(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60

Leu Met Ala Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr
65 70 75 80

Ala Gly Gln Ala Gln Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Arg Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Thr Glu Leu Met Thr Leu Thr Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Glu Ala Asn Gln Ala Ala Tyr Ser Gln Met
130 135 140

Trp Gly Gln Asp Ala Glu Ala Met Tyr Gly Tyr Ala Ala Thr Ala Ala
145 150 155 160

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr
165 170 175

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Ala Gln Gly Val Val Pro Ser Ser Lys Leu
210 215 220

Gly Gly Leu Trp Thr Ala Val Ser Pro His Leu Ser Pro Leu Ser Asn
225 230 235 240

Val Ser Ser Ile Ala Asn Asn His Met Ser Met Met Gly Thr Gly Val
245 250 255

Ser Met Thr Asn Thr Leu His Ser Met Leu Lys Gly Leu Ala Pro Ala
260 265 270

Ala Ala Gln Ala Val Glu Thr Ala Ala Glu Asn Gly Val Trp Ala Met
275 280 285

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu
290 295 300

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser
305 310 315 320

Leu Ser Val Pro Pro Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro
325 330 335

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr
340 345 350

Ala Pro Gly His Met Leu Gly
355

(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3027 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

AGTTCAGTCG AGAATGATAC TGACGGGCTG TATCCACGAT GGCTGAGACA ACCGAACCAC 60 

CGTCGGACGC GGGGACATCG CAAGCCGACG CGATGGCGTT GGCCGCCGAA GCCGAAGCCG 120

CCGAAGCCGA AGCGCTGGCC GCCGCGGCGC GGGCCCGTGC CCGTGCCGCC CGGTTGAAGC 180

GTGAGGCGCT GGCGATGGCC CCAGCCGAGG ACGAGAACGT CCCCGAGGAT ATGCAGACTG 240

GGAAGACGCC GAAGACTATG ACGACTATGA CGACTATGAG GCCGCAGACC AGGAGGCCGC 300

ACGGTCGGCA TCCTGGCGAC GGCGGTTGCG GGTGCGGTTA CCAAGACTGT CCACGATTGC 360

CATGGCGGCC GCAGTCGTCA TCATCTGCGG CTTCACCGGG CTCAGCGGAT ACATTGTGTG 420

GCAACACCAT GAGGCCACCG AACGCCAGCA GCGCGCCGCG GCGTTCGCCG CCGGAGCCAA 480

GCAAGGTGTC ATCAACATGA CCTCGCTGGA CTTCAACAAG GCCAAAGAAG ACGTCGCGCG 540

TGTGATCGAC AGCTCCACCG GCGAATTCAG GGATGACTTC CAGCAGCGGG CAGCCGATTT 600

CACCAAGGTT GTCGAACAGT CCAAAGTGGT CACCGAAGGC ACGGTGAACG CGACAGCCGT 660

CGAATCCATG AACGAGCATT CCGCCGTGGT GCTCGTCGCG GCGACTTCAC GGGTCACCAA 720

TTCCGCTGGG GCGAAAGACG AACCACGTGC GTGGCGGCTC AAAGTGACCG TGACCGAAGA 780

GGGGGGACAG TACAAGATGT CGAAAGTTGA GTTCGTACCG TGACCGATGA CGTACGCGAC 840

GTCAACACGG AAACCACTGA CGCCACCGAA GTCGCTGAGA TCGACTCAGC CGCAGGCGAA 900

GCCGGTGATT CGGCGACCGA GGCATTTGAC ACCGACTCTG CAACGGAATC TACCGCGCAG 960

AAGGGTCAGC GGCACCGTGA CCTGTGGCGA ATGCAGGTTA CCTTGAAACC CGTTCCGGTG 1020

ATTCTCATOC TCCTCATGTT GATCTCTGGG GGCGCGACGG GATGGCTATA CCTTGAGCAA 1080 

TACGACCCi:;A TCAGCAGACG GACTCCGGCG CCGCCCGTGC TGCCGTCGCC GCGGCGTCTG 1140 

ACGGGACAAT CGCGCTGTIC TGTATTCAC: CGAGACGTCG ACCAAGACTT GGCTAGCGCC 1200 

AGGTCGCACC TCGCCGGC<'^A TTTCCTGTC': TATACGACCA GTTCACGCA3 CAGAl'GGTGG 12 60 

CTCCGGCG:;C CAAACAGAAG TCACTGvWVA CCACi:GCCAA GGTGGTGCG: GCGGCCGTGT 1320 

CGGAGGTA'::A TCCGGATTCG GCCGTCGTTC TGGTTTTTGT CGACCAGAGC ACTACCAGTA 1380

AGGACAGCCC CAATCCGTCG ATGGCGGCCA GCAGCGTGAT GGTGACCCTA GCCAAGGTCG 1440

ACGGCAATTG GCTGATCACC AAGTTCACCC CGGTTTAGGT TGGCGTAGGC GGTCGCCAAG 1500

TCTGACGGGG GCGCGGGTGG CTGCTCGTGC GAGATACCGG CCGTTCTCCG GACAATCACG 1560

GCCCGACCTC AAACAGATCT CGGCCGCTGT            GGGTTATTTA AGATTAGTTG 1620

CCACTGTATT TACCTGATGT TCAGATTGTT GAGCTGGATT TAGCTTCGCG GCAGGGCGGC 1680

TGGTGCACTT TGCATCTGGG GTTGTGACTA CTTGAGAGAA TTTGACCTGT TGCCGACGTT 1740

GTTTGCTGTC CATCATTGGT GCTAGTTATG GCCGAGCGGA AGGATTATCG AAGTGGTGGA 1800

CTTCGGGGCG TTACCACCGG AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC 1860

CTCGCTGGTG GCCGCCGCGA AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC 1920

GTCGGCGTTT CAGTCGGTGG TCTGGGGTCT GACGACGGGA TCGTGGATAG GTTCGTCGGC 1980

GGGTCTGATG GTGGCGGCGG CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA 2040

GGCCGAGCTG ACCGCCGCCG AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG 2100

GCTGACGGTG GGCCCGCCGG TGATGGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC 2160

GACCAACCTC TTGGGGCAAA ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGGGA 2220

GATGTGGGCC CAAGACGCCG CCGCGATGTT TGGCTACGCC GGCACGGCGG CGACGGCGAC 2280

CGAGGCGTTG CTGCCGTTCG AGGACGCGCC ACTGATCACC AACCCCGGCG GGCTCCTTGA 2340

GCAGGCCGTC GCGGTCGAGG AGGCCATCGA CACCGCCGCG GGGAACCAGT TGATGAACAA 2400

TGTGCCGCAA GCGCTGCAAC AACTGGCCCA GCCCACGAAA AGCATCTGGC CGTTCGACCA 2460

ACTGAGTGAA CTGTGGAAAG CCATCTCGCC GCATCTGTCG CCGCTCAGGA ACATCGTGTC 2520

GATGCTCAAC AACCACGTGT CGATGACCAA CTCGGGTGTG TCGATGGGCA GCACCTTGCA 2580

CTCAATGTTG AAGGGGTTTG CTCCGGCGGC GGCTCAGGCC GTGGAAACCG CGGCGC7yW\ 2640

CGGGGTCGAG GCGATGAGCT CGCTGGGCAG CCAGCTGGGT TCGTCGCTGG GTTCTTCGGG 2700

TGTGGGGGCT GGCGTGGCCG CCAACTTGGG TCGGGCGGCC TCGGTCGGTT CGTTGTCGGT 2760

GCCGCAGGCC TGGGCCGCGG GCAACCAGGC GGTGACCGCG GCGGCGCGGG CGCTGCCGCT 2820

GACCAGCGTG ACrAGCGCCG ■:CCA,AAC;JG': CCCCGGAGAC ATGCTGGGCG GGCTAGGGCT 2880

ggg3C7\a::tg ACGAATAGCG GCGGCGGGTT CG'^;i:GGGGTT AGCAATGCGT TGCGGATGCC 2940

GCCGCGGGCG ta:gtaatgg GCCGTGTGCG CGCCGCCGGG TAAGGCCGAT CCGCACGCAA 3000

TGCGGGCCCT CTATGCGGGC AGCGATC 3027

(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 



Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met
1 5 10 15

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp
20 25 30

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45

Val Val Trp Gly Leu Thr Thr Gly Ser Trp Ile Gly Ser Ser Ala Gly
50 55 60

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr
65 70 75 80

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala
85 90 95

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala
100 105 110

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly
115 120 125

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met
130 135 140

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Thr Ala Ala
145 150 155 160

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr
165 170 175

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile
180 185 190

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu
195 200 205

Gln Gln Leu Ala Gln Pro Thr Lys Ser Ile Trp Pro Phe Asp Gln Leu
210 215 220

Ser Glu Leu Trp Lys Ala Ile Ser Pro His Leu Ser Pro Leu Ser Asn
225 230 235 240

Ile Val Ser Met Leu Asn Asn His Val Ser Met Thr Asn Ser Gly Val
245 250 255

Ser Met Ala Ser Thr Leu His Ser Met Leu Lys Gly Phe Ala Pro Ala
260 265 270

Ala Ala Gln Ala Val Glu Thr Ala Ala Gln Asn Gly Val Gln Ala Met
275 280 285

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu
290 295 300

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser
305 310 315 320

Leu Ser Val Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro
325 330 335

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr
340 345 350

Ala Pro Gly His Met Leu Gly Gly Leu Pro Leu Gly Gln Leu Thr Asn
355 360 365

Ser Gly Gly Gly Phe Gly Gly Val Ser Asn Ala Leu Arg Met Pro Pro
370 375 380

Arg Ala Tyr Val Met Pro Arg Val Pro Ala Ala Gly
385 390 395

(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1616 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

CATCGGAGGG AGTGATCACi: ATGCTGTGGC ACGC?J\TGCC ACCGGAGTAA ATAC':GCACG 60 

GCTGATGGCC GGCG:GGGTC CGGCTCCAAT GCTTGC'^GCG GCCGCGGGAT GGCAGACGCT 120 

TTCGGCGGCT CTGGACGCTi: ACGCCGTCGA GTTGACi:GCG CGCCTGAACT CTCTGGGAGA 180 
AGCCTGGACT GGAGGTGGCA GCGACAAGGC GCTTGCGCCT GCAACGCCGA TGGTGGTCTG 240

GCTACAAACC GCGTCAACAC AGGCCAAGAC CCGTGCGATG CAGGCGACGG CGCAAGCCGC 300

GGCATACACC CAGGCCATGG CCACGACGCC GTCGCTGCCG GAGATCGCCG CCAACCACAT 360

CACCCAGGCC GTCCTTACGG CCACCAACTT CTTCGGTATC AACACGATCC CGATCGCGTT 420

GACCGAGATG GATTATTTCA TCCGTATGTG GAACCAGGCA GCCCTGGCAA TGGAGGTCTA 480

CCAGGCCGAG ACCGCGGTTA ACACGCTTTT CGAGAAGCTC GAGCCGATGG CGTCGATCCT 540

TGATCCCGGC GCGAGCCAGA GCACGACGAA CCCGATCTTC GGAATGCCCT CCCCTGGCAG 600

CTCAACACCG GTTGGCCAGT TGCCGCCGGC GGCTACCCAG ACCCTCGGCC AACTGGGTGA 660

GATGAGCGGC CCGATGCAGC AGCTGACCCA GCCGCTGCAG CAGGTGACGT CGTTGTTCAG 720

CCAGGTGGGC GGCACCGGCG GCGGCAACCC AGCCGACGAG GAAGCCGCGC AGATGGGCCT 780

GGTCGGCACC AGTCCGCTGT CGAACCATCC GCTGGCTGGT GGATCAGGCC CCAGCGCGGG 840

CGCGGGCCTG CTGCGCGGGG AGTCGCTACC TGGCGCAGGT GGGTCGTTGA CCCGCACGCC 900

GGTGATGTCT CAGCTGATCG AAAAGCCGGT TGCCCGCTCG GTGATGCCGG CGGCTGCTGC 960

CGGATCGTCG GCGACGGGTG GCGCCGCTCC GGTGGGTGCG GGAGCGATGG GCCAGGGTGC 1020

GCAATCCGGC GGCTCCACCA GGCCGGGTCT GGTCGCGCCG GCACCGCTCG CGCAGGAGCG 1080

TGAAGAAGAC GACGAGGAGG ACTGGGACGA AGAGGACGAC TGGTGAGCTC CCGTAATGAC 1140

AACAGACTTC CCGGCCACCC GGGCCGGAAG ACTTGGCAAC ATTTTGGCGA GGAAGGTAAA 1200

GAGAGAAAGT AGTCCAGCAT GGCAGAGATG AAGAGCGATG CCGCTAGCCT CGGGCAGGAG 1260

GCAGGTAATT TCGAGCGGAT CTCCGGCGAG CTGAAAAGCC AGATCGAGCA GGTGGAGTCG 1320

ACGGCAGGTT CGTTGCAGGG CCAGTGGCGC GGCGCGGCGG GGACGGCCGC CCAGGCCGCG 1380

GTGGTGCGGT TCCAAGAAGC AGCCAATAAG CAGAAGCAGG AACTCGACGA GATCTCGACG 1440

AATATTCGTC AGGCCGGCGT GCAATACTCG AGGGCCGACG AGGAGCAGCA GCAGGCGCTG 1500

TGCTCGCAAA TGGGCTTCTG ACCCGCTAAT ACGAAAAGAA ACGGAGCAAA AAGATC»ACAG 1560

AGCAGCAGTG GAATTTCGGG G!:;TATCGAGG CCGGGi:;CAAG GGCAATCCAG GGAAAT 1616 
(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 432 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

CTAGTGGATG GGACCATGGC CATTTTCTGC AGTCTCACTG CCTTCTGTGT TGACATTTTG 60

GCACGCCGGC GGAAACGAAG CACTGGGGTC GAAGAACGGC TGCGCTGCCA TATCGTCCGG 120

AGCTTCCATA CCTTCGTGCG GCCGGAAGAG CTTGTCGTAG TCGGCCGCCA TGACAACCTC 180

TCAGAGTGCG CTCAAACGTA TAAACACGAG AAAGGGCGAG ACCGACGGA^Qi GGTCGAACTC 240

GCCCGATCCC GTGTTTCGCT ATTCTACGCG AACTCGGCGT TGCCCTATGC GAACATCCCA 300

GTGACGTTGC CTTCGGTCGA AGCCATTGCC TGACCGGCTT CGCTGATCGT CCGCGCCAGG 360

TTCTGCAGCG CGTTGTTCAG CTCGGTAGCC GTGGCGTCCC ATTTTTGCTG GACACCCTGG 420

TACGCCTCCG AA 432
(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Met Leu Trp His Ala Met Pro Pro Glu Xaa Asn Thr Ala Arg Leu Met
1 5 10 15

Ala Gly Ala Gly Pro Ala Pro Met Leu Ala Ala Ala Ala Gly Trp Gln
20 25 30

Thr Leu Ser Ala Ala Leu Asp Ala Gln Ala Val Glu Leu Thr Ala Arg
35 40 45

Leu Asn Ser Leu Gly Glu Ala Trp Thr Gly Gly Gly Ser Asp Lys Ala
50 55 60

Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu Gln Thr Ala Ser Thr
65 70 75 80

Gln Ala Lys Thr Arg Ala Met Gln Ala Thr Ala Gln Ala Ala Ala Tyr
85 90 95

Thr Gln Ala Met Ala Thr Thr Pro Ser Leu Pro Glu Ile Ala Ala Asn
100 105 110

His Ile Thr Gln Ala Val Leu Thr Ala Thr Asn Phe Phe Gly Ile Asn
115 120 125

Thr Ile Pro Ile Ala Leu Thr Glu Met Asp Tyr Phe Ile Arg Met Trp
130 135 140

Asn Gln Ala Ala Leu Ala Met Glu Val Tyr Gln Ala Glu Thr Ala Val
145 150 155 160

Asn Thr Leu Phe Glu Lys Leu Glu Pro Met Ala Ser Ile Leu Asp Pro
165 170 175

Gly Ala Ser Gln Ser Thr Thr Asn Pro Ile Phe Gly Met Pro Ser Pro
180 185 190

Gly Ser Ser Thr Pro Val Gly Gln Leu Pro Pro Ala Ala Thr Gln Thr
195 200 205

Leu Gly Gln Leu Gly Glu Met Ser Gly Pro Met Gln Gln Leu Thr Gln
210 215 220

Pro Leu Gln Gln Val Thr Ser Leu Phe Ser Gln Val Gly Gly Thr Gly
225 230 235 240

Gly Gly Asn Pro Ala Asp Glu Glu Ala Ala Gln Met Gly Leu Leu Gly
245 250 255

Thr Ser Pro Leu Ser Asn His Pro Leu Ala Gly Gly Ser Gly Pro Ser
260 265 270

Ala Gly Ala Gly Leu Leu Arg Ala Glu Ser Leu Pro Gly Ala Gly Gly
275 280 285

Ser Leu Thr Arg Thr Pro Leu Met Ser Gln Leu Ile Glu Lys Pro Val
290 295 300

Ala Pro Ser Val Met Pro Ala Ala Ala Ala Gly Ser Ser Ala Thr Gly
305 310 315 320

Gly Ala Ala Pro Val Gly Ala Gly Ala Met Gly Gln Gly Ala Gln Ser
325 330 335

Gly Gly Ser Thr Arg Pro Gly Leu Val Ala Pro Ala Pro Leu Ala Gln
340 345 350

Glu Arg Glu Glu Asp Asp Glu Asp Asp Trp Asp Glu Glu Asp Asp Trp
355 360 365



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Met Ala Glu Met Lys Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly
1 5 10 15

Asn Phe Glu Arg Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val
20 25 30

Glu Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly
35 40 45

Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys
50 55 60

Gln Lys Gln Glu Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly
65 70 75 80

Val Gln Tyr Ser Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser
85 90 95

Gln Met Gly Phe
100

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

GATCTCCGGC GACCTGAAAA CCCAGATCGA CCAGGTGGAG TCGACGGCAG GTTCGTTGCA 60 

GGGCCAGTGG CGCGGCGCGG CGGGGACGGC CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA 120

AGCAGCCAAT AAGCAGAAGC AGGAACTCGA CGAGATCTCG ACGAATATTC GTCAGGCCGG 180

CGTCCAATAC TCGAGGGCCG ACGAGGAGCA GCAGCAGGCG CTGTCCTCGC AAATGGGCTT 240

CTGACCCGCT AATACGAAAA GAAACGGAGC AAAAACATGA CAGAGCAGCA GTGGAATTTC 300

GCGGGTATCG AGGCCGCGGC AAGCGCAATC CAGGGAAATG TCACGTCCAT TCATTCCCTC 360 

CTTGACGAGG GGAAGCAGTC CCTGACCAAG CTCGCA 396 

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala
1 5 10 15

Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln
20 25 30

Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu
35 40 45

Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser
50 55 60

Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
65 70 75 80
65 70 75 80 



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

GTGGATCCCG ATCCCGTGTT TCGCTATTCT ACGCGAACTC GGCGTTGCCC TATGCGAACA 60

TCCCAGTGAC GTTGCCTTCG GTCGAAGCCA TTGCCTGACC GGCTTCGCTG ATCGTCCGCG 120

CCAGGTTCTG CAGCGCGTTG TTCAGCTCGG TAGCCGTGGC GTCCCATTTT TGCTGGACAC 180

CCTGGTACGC CTCCGAACCG CTACCGCCCC AGGCCGCTGC GAGCTTGGTC AGGGACTGCT 240

TCCCCTCGTC AAGGAGGGAA TGAATGGACG TGACATTTCC CTGGATTGCG CTTGCCGCGG 300

CCTCGATACC CGCGAAATTC CACTGCTGCT CTGTCATGTT TTTGCTCCGT TTCTTTTCGT 360

ATTAGCGGGT CAGAAGCCCA TTTGCGA 387



(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 272 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

CGGCACGAGG ATCTCGGTTG GCCCAACGGC GCTGGCGAGG GCTCCGTTCC GGGGGCGAGC 60 

TGCGCGCCGG ATGCTTCCTC TGCCCGCAGC CGCGCCTGGA TGGATGGACC AGTTGCTACC 120 

TTCCCGACGT TTCGTTCGGT GTCTGTGCGA TAGCGGTGAC CCCGGCGCGC ACGTCGGGAG 180 

TGTTGGGGGG CAGGCCGGGT CGGTGGTTCG GCCGGGGACG CAGACGGTCT GGACGGAACG 240

GGCGGGGGTT CGCCGATTGG CATCTTTGCC CA 272
(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val
1 5 10 15

Val Ala Ala Leu
20



(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 
1 5 10 15

Glu Gly Arg 



(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro 
1 5 10 15

Ser 



(2) INFORMATION FOR SEQ ID NO: 122: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser
1 5 10 15 

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

Asp Pro Pro Asp Pro His Gln Xaa Asp Met Thr Lys Gly Tyr Tyr Pro
1 5 10 15

Gly Gly Arg Arg Xaa Phe
20

(2) INFORMATION FOR SEQ ID NO: 125:




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:125: 

Asp Pro Gly Tyr Thr Pro Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Second Residue Can Be Either 
Pro or Thr" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Xaa Xaa Gly Phe Thr Gly Pro Gln Phe Tyr
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ix) FEATURE:

(D) OTHER INFORMATION: /note= "The Third Residue Can Be Either
Gln or Leu"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Xaa Pro Xaa Val Thr Ala Tyr Ala Gly
1 5

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

Xaa Xaa Xaa Glu Lys Pro Phe Leu Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:129: 

Xaa Asp Ser Glu Lys Ser Ala Thr Ile Lys Val Thr Asp Ala Ser
1 5 10 15

(2) INFORMATION FOR SEQ ID NO:130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Ala Gly Arg Thr Xaa Ile Tyr Ile Val Gly Asn Leu Thr Ala Asp
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

Ala Pro Glu Ser Gly Ala Gly Leu Gly Gly Thr Val Gln Ala Gly
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:132: 

Xaa Tyr Ile Ala Tyr Xaa Thr Thr Ala Gly Ile Val Pro Gly Lys Ile
1 5 10 15

Asn Val His Leu Val
20

(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 882 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

GCAACGCTGT CGTGGCCTTT GCGGTGATCG GTTTCGCCTC GCTGGCGGTG GCGGTGGCGG 60 

TCACCATCCG ACCGACCGCG GCCTCAAAAC CGGTAGAGGG ACACCAAAAC GCCCAGCCAG 120

GGAAGTTCAT GCCGTTGTTG CCGACGCAAC AGCAGGCGCC GGTCCCGCCG CCTCCGCCCG 180 

ATGATCCCAC CGCTGGATTC CAGGGCGGCA CCATTCCGGC TGTACAGAAC GTGGTGCCGC 240

GGCCGGGTAC CTCACCCGGG GTGGGTGGGA CGCCGGCTTC GCCTGCGCCG GAAGCGCCGG 300

CCGTGCCCGG TGTTGTGCCT GCCCCGGTGC CAATCCCGGT CCCGATCATC ATTCCCCCGT 360 

TCCCGGGTTG GCAGCCTGGA ATGCCGACCA TCCCCACCGC ACCGCCGACG ACGCCGGTGA 420

CCACGTCGGC GACGACGCCG CCGACCACGC CGCCGACCAC GCCGGTGACC ACGCCGCCAA 480 

CGACGCCGCC GACGACGCCG GTGACCACGC CGCCAACGAC GCCGCCGACC ACGCCGGTGA 540

CCACGCCACC AACGACCGTC GCCCCGACGA CCGTCGCCCC GACGACGGTC GCTCCGACCA 600 

CCGTCGCCCC GACGACGGTC GCTCCAGCCA CCGCCACGCC GACGACCGTC GCTCCGCAGC 660 

CGACGCAGCA GCCCACGCAA CAACCAACCC AACAGATGCC AACCCAGCAG CAGACCGTGG 720

CCCCGCAGAC GGTGGCGCCG GCTCCGCAGC CGCCGTCCGG TGGCCGCAAC GGCAGCGGCG 780

GGGGCGACTT ATTCGGCGGG TTCTGATCAC GGTCGCGGCT TCACTACGGT CGGAGGACAT 840

GGCCGGTGAT GCGGTGACGG TGGTGCTGCC CTGTCTCAAC GA 882

(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 815 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

CCATCAACCA ACCGCTCGCG CCGCCCGCGC CGCCGGATCC GCCGTCGCCG CCACGCCCGC 60 

CGGTGCCTCC GGTGCCCCCG TTGCCGCCGT CGCCGCCGTC GCCGCCGACC GGCTGGGTGC 120 

CTAGGGCGCT GTTACCGCCC TGGTTGGCGG GGACGCCGCC GGCACCACCG GTACCGCCGA 180 

TGGCGCCGTT GCCGCCGGCG GCACCGTTGC CACCGTTGCC ACCGTTGCCA CCGTTGCCGA 240

CCAGCCACCC GCCGCGACCA CCGGCACCGC CGGCGCCGCC CGCACCGCCG GCGTGCCCGT 300

TCGTGCCCGT ACCCCCGGCA CCGCCGTTGC CGCCGTCACC GCCGACGGAA CTACCGGCGG 360

ACGCGGCCTG CCCGCCGGCG CCGCCCGCAC CCCCATTGGC ACCGCCGTCA CCGCCGGCTG 420

GGAGTGCCGC GATTAGGGCA CTGACCGGCG CAACCAGTGC AAGTACTCTC GGTCACCGAG 480

CACTTCCAGA CGACACCACA GCACGGGGTT GTCGGCGGAC TGGGTGAAAT GGCAGCCGAT 540

AGCGGCTAGC TGTCGGCTGC GGTCAACCTC GATCATGATG TCGAGGTGAC CGTGACCGCG 600

CCCCCCGAAG GAGGCGCTGA ACTCGGCGTT GAGCCGATCG GCGATCGGTT GGGGCAGTGC 660 

CCAGGCCAAT ACGGGGATAC CGGGTGTCNA AGCCGCCGCG AGCGCAGCTT CGGTTGCGCG 720

ACNGTGGTCG GGGTGGCCTG TTACGCCGTT GTCNTCGAAC ACGAGTAGCA GGTCTGCTCC 780

GGCGAGGGCA TCCACCACGC GTTGCGTCAG CTCGT 815 
(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1152 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

ACCAGCCGCC GGCTGAGGTC TCAGATCAGA GAGTCTCCGG ACTCACCGGG GCGGTTCAGC 60 

CTTCTCCCAG AACAACTGCT GAAGATCCTC GCCCGCGAAA GAGGCGCTGA TTTGACGCTC 120 

TATGACCGGT TGAACGACGA GATCATCCGG CAGATTGATA TGGCACCGCT GGGCTAACAG 180 

GTGCGCAAGA TGGTGCAGCT GTATGTCTCG GACTCCGTGT CGCGGATCAG CTTTGCCGAC 240

GGCCGGGTGA TCGTGTGGAG CGAGGAGCTC GGCGAGAGCC AGTATCCGAT CGAGACGCTG 300 

GACGGCATCA CGCTGTTTGG GCGGCCGACG ATGACAACGC CCTTCATCGT TGAGATGCTC 360

AAGCGTGAGC GC9ACATCCA GCTCTTCACG ACCGACGGCC ACTACCAGGG CCGGATCTCA 420

ACACCCGACG TGTCATACGC GCCGCGGCTC CGTCAGCAAG TTCACCGCAC CGACGATCCT 480

GCGTTCTGCC TGTCGTTAAG CAAGCGGATC GTGTCGAGGA AGATCCTGAA TCAGCAGGCC 540

TTGATTCGGG CACACACGTC GGGGCAAGAC GTTGCTGAGA GCATCCGCAC GATGAAGCAC 600 

TCGCTGGCCT GGGTCGATCG ATCGGGCTCC CTGGCGGAGT TGAACGGGTT CGAGGGAAAT 660

GCCGCAAAGG CATACTTCAC CGCGCTGGGG CATCTCGTCC CGCAGGAGTT OGCATTCCAG 720

GGCCGCTCGA CTGGGCCGCC GTT':^GACGCC TTCAACTCGA TGGTCAGCCT CGGCTATTCG 780

CTGCTGTACA AGAAGATCAT AGGGGCGATC GAGCGTCACA GCCTGAACG': GTATATCGGl 840 

TTCCTACACC AGGATTCACG AGGGCACGCA ACGTCTCGTG GCGAATTCGG CACGAGCTCC 900

GCTGAAACCG CTGGCCGGCT GCTCAGTGCC CGTACGTAAT CCGCTGCGCC CAGGCCGGCC 960

CGCCGGCCGA ATACCAGCAG ATCGGACAGC GAATTGCCGC CCAGCCGGTT GGAGCCGTGC 1020 

ATACCGCCGG CACACTCACC GGCAGCGAAC AGGCCTGGCA CCGTGGCGGC GCCGGTGTCC 1080 

GCGTCTACTT CGACACCGCC CATCACGTAG TGACACGTCG GCCCGACTTC CATTGCCTGC 1140

GTTCGGCACG AG 1152

(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:136: 

CTCGTGCCGA TTCGGCAGGG TGTACTTGCC GGTGGTGTAN GCCGCATGAG TGCCGACGAC 60 

CAGCAATGCG GCAACAGCAC GGATCCCGGT CAACGACGCC ACCCGGTCCA CGTGGGCGAT 120 

CCGCTCGAGT CCGCCCTGGG CGGCTCTTTC CTTGGGCAGG GTCATCCGAC GTGTTTCCGC 180 

CGTGGTTTGC CGCCATTATG CCGGCGCGCC GCGTCGGGCG GCCGGTATGG CCGAANGTCG 240 

ATCAGCACAC CCGAGATACG GGTCTGTGCA AGCTTTTTGA GCGTCGGGCG GGGCAGCTTC 300 

GCCGGCAATT CTACTAGCGA GAAGTCTGGC CCGATACGGA TCTGACCGAA GTCGCTGCGG 360 

TGCAGCCCAC CCTCATTGGC GATGGCGCCG ACGATGGCGC CTGGACCGAT CTTGTGCCGC 420

TTGCCGACGG CGACGCGGTA GGTGGTCAAG TCCGGTCTAC GCTTGGGCCT TTGCGGACGG 480

TCCCGACGCT GGTCGCGGTT GCGCCGCGAA AGCGGCGGGT CGGGTGCCAT CAGGAATGCC 540

TCACCGCCGC GGCACTGCAC GGCCAGTGCC GCGGCGATGT CAGCCATCGG GACATCATGC 600 

TCGCGTTCAT ACTCCTCGAC CAGTCGGCGG AACAGCTCGA TTCi:CGGACC GCiZCA 655 
(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 267 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 



Asn Ala Val Val Ala Phe Ala Val Ile Gly Phe Ala Ser Leu Ala Val
1 5 10 15

Ala Val Ala Val Thr Ile Arg Pro Thr Ala Ala Ser Lys Pro Val Glu
20 25 30

Gly His Gln Asn Ala Gln Pro Gly Lys Phe Met Pro Leu Leu Pro Thr
35 40 45

Gln Gln Gln Ala Pro Val Pro Pro Pro Pro Pro Asp Asp Pro Thr Ala
50 55 60

Gly Phe Gln Gly Gly Thr Ile Pro Ala Val Gln Asn Val Val Pro Arg
65 70 75 80

Pro Gly Thr Ser Pro Gly Val Gly Gly Thr Pro Ala Ser Pro Ala Pro
85 90 95

Glu Ala Pro Ala Val Pro Gly Val Val Pro Ala Pro Val Pro Ile Pro
100 105 110

Val Pro Ile Ile Ile Pro Pro Phe Pro Gly Trp Gln Pro Gly Met Pro
115 120 125

Thr Ile Pro Thr Ala Pro Pro Thr Thr Pro Val Thr Thr Ser Ala Thr
130 135 140

Thr Pro Pro Thr Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr
145 150 155 160

Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr Thr Pro Pro Thr
165 170 175

Thr Pro Val Thr Thr Pro Pro Thr Thr Val Ala Pro Thr Thr Val Ala
180 185 190

Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro
195 200 205

Ala Thr Ala Thr Pro Thr Thr Val Ala Pro Gln Pro Thr Gln Gln Pro
210 215 220

Thr Gln Gln Pro Thr Gln Gln Met Pro Thr Gln Gln Gln Thr Val Ala
225 230 235 240

Pro Gln Thr Val Ala Pro Ala Pro Gln Pro Pro Ser Gly Gly Arg Asn
245 250 255

Gly Ser Gly Gly Gly Asp Leu Phe Gly Gly Phe
260 265

(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

Ile Asn Gln Pro Leu Ala Pro Pro Ala Pro Pro Asp Pro Pro Ser Pro
1 5 10 15

Pro Arg Pro Pro Val Pro Pro Val Pro Pro Leu Pro Pro Ser Pro Pro 
20 25 30 

Ser Pro Pro Thr Gly Trp Val Pro Arg Ala Leu Leu Pro Pro Trp Leu 
35 40 45 

Ala Gly Thr Pro Pro Ala Pro Pro Val Pro Pro Met Ala Pro Leu Pro 
50 55 60 

Pro Ala Ala Pro Leu Pro Pro Leu Pro Pro Leu Pro Pro Leu Pro Thr 
65 70 75 80 

Ser His Pro Pro Arg Pro Pro Ala Pro Pro Ala Pro Pro Ala Pro Pro 
85 90 95 

Ala Cys Pro Phe Val Pro Val Pro Pro Ala Pro Pro Leu Pro Pro Ser 
100 105 110 

Pro Pro Thr Glu Leu Pro Ala Asp Ala Ala Cys Pro Pro Ala Pro Pro 
115 120 125 

Ala Pro Pro Leu Ala Pro Pro Ser Pro Pro Ala Gly Ser Ala Ala Ile
130 135 140 

Arg Ala Leu Thr Gly Ala Thr Ser Ala Ser Thr Leu Gly His Arg Ala 
145 150 155 160 

Leu Pro Asp Asp Thr Thr Ala Arg Gly Cys Arg Arg Thr Gly
165 170

(2) INFORMATION FOR SEQ ID NO: 139:



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

Gln Pro Pro Ala Glu Val Ser Asp Gln Arg Val Ser Gly Leu Thr Gly
1 5 10 15

Ala Val Gln Pro Ser Pro Arg Thr Thr Ala Glu Asp Pro Arg Pro Arg
20 25 30 

Asn Arg Arg 
35 

(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 

Arg Ala Asp Ser Ala Gly Cys Thr Cys Arg Trp Cys Xaa Pro His Glu 
1 5 10 15 

Cys Arg Arg Pro Ala Met Arg Gln Gln His Gly Ser Arg Ser Thr Thr
20 25 30 

Pro Pro Gly Pro Arg Gly Arg Ser Ala Arg Val Arg Pro Gly Arg Leu 
35 40 45 

Phe Pro Trp Ala Gly Ser Ser Asp Val Phe Pro Pro Trp Phe Ala Ala 
50 55 60 

Ile Met Pro Ala Arg Arg Val Gly Arg Pro Val Trp Pro Xaa Val Asp
65 70 75 80

Gln His Thr Arg Asp Thr Gly Leu Cys Lys Leu Phe Glu Arg Arg Ala
85 90 95

Gly Gln Leu Arg Arg Gln Phe Tyr
100

(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 
GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC 53 
(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 

CCTGAATTCA GGCCTCGGTT GCGCCGGCCT CATCTTGAAC GA 42

(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer"



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 31

(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:
CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 31 
(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 33

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 
GAGAGAATTC TCAGAAGCCC ATTTGCGAGG ACA 33 
(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mycobacterium tuberculosis

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 152..1273



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120 

GCGGAAATTG AAGAGCACAG AAAGGTATGG C GTG AAA ATT CGT TTG CAT ACG 172 

Val Lys Ile Arg Leu His Thr
1 5

CTG TTG GCC GTG TTG ACC GCT GCG CCG CTG CTG CTA GCA GCG GCG GGC 220
Leu Leu Ala Val Leu Thr Ala Ala Pro Leu Leu Leu Ala Ala Ala Gly
10 15 20

TGT GGC TCG AAA CCA CCG AGC GGT TCG CCT GAA ACG GGC GCC GGC GCC 268
Cys Gly Ser Lys Pro Pro Ser Gly Ser Pro Glu Thr Gly Ala Gly Ala
25 30 35

GGT ACT GTC GCG ACT ACC CCG GCG TCG TCG CCG GTG ACG TTG GCG GAG 316
Gly Thr Val Ala Thr Thr Pro Ala Ser Ser Pro Val Thr Leu Ala Glu
40 45 50 55

ACC GGT AGC ACG CTG CTC TAC CCG CTG TTC AAC CTG TGG GGT CCG GCC 364
Thr Gly Ser Thr Leu Leu Tyr Pro Leu Phe Asn Leu Trp Gly Pro Ala
60 65 70

TTT CAC GAG AGG TAT CCG AAC GTC ACG ATC ACC GCT CAG GGC ACC GGT 412
Phe His Glu Arg Tyr Pro Asn Val Thr Ile Thr Ala Gln Gly Thr Gly
75 80 85

TCT GGT GCC GGG ATC GCG CAG GCC GCC GCC GGG ACG GTC AAC ATT GGG 460
Ser Gly Ala Gly Ile Ala Gln Ala Ala Ala Gly Thr Val Asn Ile Gly
90 95 100

GCC TCC GAC GCC TAT CTG TCG GAA GGT GAT ATG GCC GCG CAC AAG GGG 508
Ala Ser Asp Ala Tyr Leu Ser Glu Gly Asp Met Ala Ala His Lys Gly
105 110 115

CTG ATG AAC ATC GCG CTA GCC ATC TCC GCT CAG CAG GTC AAC TAC AAC 556
Leu Met Asn Ile Ala Leu Ala Ile Ser Ala Gln Gln Val Asn Tyr Asn
120 125 130 135

CTG CCC GGA GTG AGC GAG CAC CTC AAG CTG AAC GGA AAA GTC CTG GCG 604
Leu Pro Gly Val Ser Glu His Leu Lys Leu Asn Gly Lys Val Leu Ala
140 145 150

GCC ATG TAC CAG GGC ACC ATC AAA ACC TGG GAC GAC CCG CAG ATC GCT 652
Ala Met Tyr Gln Gly Thr Ile Lys Thr Trp Asp Asp Pro Gln Ile Ala
155 160 165

GCG CTC AAC CCC GGC GTG AAC CTG CCC GGC ACC GCG GTA GTT CCG CTG 700
Ala Leu Asn Pro Gly Val Asn Leu Pro Gly Thr Ala Val Val Pro Leu
170 175 180

CAC CGC TCC GAC GGG TCC GGT GAC ACC TTC TTG TTC ACC CAG TAC CTG 748
His Arg Ser Asp Gly Ser Gly Asp Thr Phe Leu Phe Thr Gln Tyr Leu
185 190 195

TCC AAG CAA GAT CCC GAG GGC TGG GGC AAG TCG CCC GGC TTC GGC ACC 796
Ser Lys Gln Asp Pro Glu Gly Trp Gly Lys Ser Pro Gly Phe Gly Thr
200 205 210 215

ACC GTC GAC TTC CCG GCG GTG CCG GGT GCG CTG GGT GAG AAC GGC AAC 844
Thr Val Asp Phe Pro Ala Val Pro Gly Ala Leu Gly Glu Asn Gly Asn
220 225 230

GGC GGC ATG GTG ACC GGT TGC GCC GAG ACA CCG GGC TGC GTG GCC TAT 892
Gly Gly Met Val Thr Gly Cys Ala Glu Thr Pro Gly Cys Val Ala Tyr
235 240 245

ATC GGC ATC AGC TTC CTC GAC CAG GCC AGT CAA CGG GGA CTC GGC GAG 940
Ile Gly Ile Ser Phe Leu Asp Gln Ala Ser Gln Arg Gly Leu Gly Glu
250 255 260

GCC CAA CTA GGC AAT AGC TCT GGC AAT TTC TTG TTG CCC GAC GCG CAA 988
Ala Gln Leu Gly Asn Ser Ser Gly Asn Phe Leu Leu Pro Asp Ala Gln
265 270 275

AGC ATT CAG GCC GCG GCG GCT GGC TTC GCA TCG AAA ACC CCG GCG AAC 1036
Ser Ile Gln Ala Ala Ala Ala Gly Phe Ala Ser Lys Thr Pro Ala Asn
280 285 290 295

CAG GCG ATT TCG ATG ATC GAC GGG CCC GCC CCG GAC GGC TAC CCG ATC 1084
Gln Ala Ile Ser Met Ile Asp Gly Pro Ala Pro Asp Gly Tyr Pro Ile
300 305 310

ATC AAC TAC GAG TAC GCC ATC GTC AAC AAC CGG CAA AAG GAC GCC GCC 1132
Ile Asn Tyr Glu Tyr Ala Ile Val Asn Asn Arg Gln Lys Asp Ala Ala
315 320 325

ACC GCG CAG ACC TTG CAG GCA TTT CTG CAC TGG GCG ATC ACC GAC GGC 1180
Thr Ala Gln Thr Leu Gln Ala Phe Leu His Trp Ala Ile Thr Asp Gly
330 335 340

AAC AAG GCC TCG TTC CTG GAC CAG GTT CAT TTC CAG CCG CTG CCG CCC 1228
Asn Lys Ala Ser Phe Leu Asp Gln Val His Phe Gln Pro Leu Pro Pro
345 350 355

GCG GTG GTG AAG TTG TCT GAC GCG TTG ATC GCG ACG ATT TCC AGC 1273
Ala Val Val Lys Leu Ser Asp Ala Leu Ile Ala Thr Ile Ser Ser
360 365 370

TAGCCTCGTT GACCACCACG CGACAGCAAC CTCCGTCGGG CCATCGGGCT GCTTTGCGGA 1333 

GCATGCTGGC CCGTGCCGGT GAAGTCGGCC GCGCTGGCCC GGCCATCCGG TGGTTGGGTG 1393 

GGATAGGTGC GGTGATCCCG CTGCTTGCGC TGGTCTTGGT GCTGGTGGTG CTGGTCATCG 1453

AGGCGATGGG TGCGATCAGG CTCT^CGGGT TGCATTTCTT CACCGCCACC GAATGGAATC 1513 

CAGGCAACAC CTACGGCGAA ACCGTTGTCA CCGACGCGTC GGCCATCCGG TCGGCGCCTA 1573 

CTACGGGGCG TTGCCGCTGA TCGTCGGGAC GCTGGCGACC TCGGCAATCG CCCTGATCAT 1633 

CGCGGTGCCG GTCTCTGTAG GAGCGGCGCT GGTGATCGTG GAACGGCTGC CGAAACGGTT 1693 

GGCCGAGGCT GTGGGAATAG TCCTGGAATT GCTCGCCGGA ATCCCCAGCG TGGTCGTCGG 1753

TTTGTGGGGG :;CAATGACGT TCGGGCCGTT CATCGCTCAT CACATCGCTC CGGTGATCGC 1813 

TCACAACGCT CCCGATGTGC CGGTGCTGAA CTACTTGCGC GGCGACCCGG GCAACGGGGA 1873

GGGCATGTTG GTGTCCGGTC TGGTGTTGGC GGTGATGGTC GTTCCCATTA TCGCCACCAC 1933 

CACTCATGAC CTGTTCCGGC AGGTGCCGGT GTTGCCCCGG GAGGGCGCGA TCGGGAATTC 1993 



(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 374 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

Val Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120 

GCGGAAATTG AAGAGCACAG AAAGGTATGG CGTGAAAATT CGTTTGCATA CGCTGTTGGC 180 

CGTGTTGACC GCTGCGCCGC TGCTGCTAGC AGCGGCGGGC TGTGGCTCGA AACCACCGAG 240

CGGTTCGCCT GAAACGGGCG CCGGCGCCGG TACTGTCGCG ACTACCCCCG CGTCGTCGCC 300 

GGTGACGTTG GCGGAGACCG GTAGCACGCT GCTCTACCCG CTGTTCAACC TGTGGGGTCC 360 

GGCCTTTCAC GAGAGGTATC CGAACGTCAC GATCACCGCT CAGGGCACCG GTTCTGGTGC 420

CGGGATCGCG CAGGCCGCCG CCGGGACGGT CAACATTGGG GCCTCCGACG CCTATCTGTC 480

GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT GATGAACATC GCGCTAGCCA TCTCCGCTCA 540

GCAGGTCAAC TACAACCTGC CCGGAGTGAG CGAGCACCTC AAGCTGAACG GAAAAGTCCT 600

GGCGGCCATG TACCAGGGCA CCATCAAAAC CTGGGACGAC CCGCAGATCG CTGCGCTCAA 660

CCCCGGCGTG AACCTGCCCG GCACCGCGGT AGTTCCGCTG CACCGCTCCG ACGGGTCCGG 720

TGACACCTTC TTGTTCACCC AGTACCTGTC CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC 780 

GCCCGGCTTC GGCACCACCG TCGACTTCCC GGCGGTGCCG GGTGCGCTGG GTGAGAACGG 840 

CAACGGCGGC ATGGTGACCG GTTGCGCCGA GACACCGGGC TGCGTGGCCT ATATCGGCAT 900 

CAGCTTCCTC GACCAGGCCA GTCAACGGGG ACTCGGCGAG GCCCAACTAG GCAATAGCTC 960 

TGGCAATTTC TTGTTGCCCG ACGCGCAAAG CATTCAGGCC GCGGCGGCTG GCTTCGCATC 1020

GAAAACCCCG GCGAACCAGG CGATTTCGAT GATCGACGGG CCCGCCCCGG ACGGCTACCC 1080

GATCATCAAC TACGAGTACG CCATCGTCAA CAACCGGCAA AAGGACGCCG CCACCGCGCA 1140

GACCTTGCAG GCATTTCTGC ACTGGGCGAT CACCGACGGC AACAAGGCCT CGTTCCTCGA 1200

CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC GGTGGTGAAG TTGTCTGACG CGTTGATCGC 1260

GACGATTTCC AGCTAGCCTC GTTGACCACC ACGCGACAGC AACCTCCGTC GGGCCATCGG 1320

GCTGCTTTGC GGAGCATGCT GGCCCGTGCC GGTGAAGTCG GCCGCGCTGG CCCGGCCATC 1380

CGGTGGTTGG GTGGGATAGG TGCGGTGATC CCGCTGCTTG CGGTGGTCTT GGTGCTGGTG 1440

GTGCTGGTGA TCGAGGCGAT GGGTGCGATC AGGCTCAACG GGTTGCATTT CTTCACCGCC 1500 

ACCGAATGGA ATCCAGGCAA CACCTACGGC GAAACCGTTG TCACCGACGC GTCGCCCATC 1560 

CGGTCGGCGC CTACTACGGG GCGTTGCCGC TGATCGTCGG GAGGCTGGCG ACCTCGGCAA 1620

TCGCCCTGAT CATCGCGGTG CCGGTCTCTG TAGGAGCGGC GCTGGTGATC GTGGAACGGC 1680

TGCCGAAACG GTTGGCCGAG GCTGTGGGAA TAGTCCTGGA ATTGCTCGCC GGAATCCCCA 1740

GCGTGGT:GT CGGTTTGTGG G'^GGGAATGA CGTTCGGGGC GTTCATCGCT CATCACATCG 1800

CTCCGGTGAT CGCTCACAAC GCTCCCGATG TGCCGGTGCT GAACTACTTG CGCGGCGACC 1860

CGGGCAAGGG GGAGGGCATG TTGGTGTCCG GTCTGGTGTT GGCGGTGATG GTCGTTCCCA 1920

TTATCGCCAC CACCACTGAT GACGTt^TTCC GGCAGGTGCC GGTGTTGCCC CGGGAGGGCG 1980

CGATCGGGAA TTC 1993
(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 374 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

Met Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1777 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:

GGTCTTGACC ACCACCTGGG TGTCGAAGTi: GGTGCCCGGA TTGAAGTCCA GGTACTCGTG 60 

GGTGGGGCGG GCGAAACAAT AGCGACAAGC ATGCGAGCAG CCGCGGTAGC CGTTGACGGT 120

GTAGCGAAAC GGCAAi:GCGG CCGCGTTGGG CACCTTGTTC AGCGCTGATT TGCACAACAC 180

CTCGTGGAAG GTGATGCCGT CGAATTGTGG CGCGCGAACG CTGCGGACGA GGCCGATCCG 240

CTGCAACCCG GCAG^GCCCG TCGTCAACGG GCATCCCGTT CACCGCGACG GCTTGCCGGG 300

CCCAACGCAT ;v:CArrATT':: GAACAACCGT TCTATACTTT GTCAACGGTG GCC^CTACCG 360

AGCCCCGCAC AGGATGTGAT ATGCCATCTC TGCCCGCACA GAGAGGAGCC AGGCCTTATG 420

ACAGCATTCG G:GTCGAGCC CTACGGGCAG CCGAAGTACC TAGAAATCGC CGGGAAGCGG 480

ATGGCGTATA TCGACGAAGG CAAGGGTGAC GCCATCGTCT TTCAGCACGG CAACCCCACG 540

TCGTCTTACT TGTGGCGCAA CATCATGCCG CACTTGGAAG GGCTGGGCCG GCTGGTCGCC 600
TGCGATCTGA TCGGGATGGG CGCGTCGGAC AAGCTCAGCC CATCGGGACC CGACCGCTAT 660

AGCTATGGCG AGCAACGAGA CTTTTTGTTC GCGCTCTGGG ATGCGCTCGA CCTCGGCGAC 720 
CACGTGGTAC TGGTGCTGCA CGACTGGGGC TCGGCGCTCG GCTTCGACTG GGCTAACCAG 780 
CATCGGGACC GAGTGCAGGG GATCGCGTTC ATGGAAGCGA TCGTCACCCC GATGACGTGG 840 
GCGGACTGGC CGCCGGCCGT GCGGGGTGTG TTCCAGGGTT TCCGATCGCC TCAAGGCGAG 900 
CCAATGGCGT TGGAGCACAA CATCTTTGTC GAACGGGTGC TGCCCGGGGC GATCCTGCGA 960

CAGCTCAGCG ACGAGGAAAT GAACCACTAT CGGCGGCCAT TCGTGAACGG CGGCGAGGAC 1020 

CGTCGCCCCA CGTTGTCGTG GCCACGAAAC CTTCCAATCG ACGGTGAGCC CGCCGAGGTC 1080 

GTCGCGTTGG TCAACGAGTA CCGGAGCTGG CTCGAGGAAA CCGACATGCC GAAACTGTTC 1140

ATCAACGCCG AGCCCGGCGC GATCATCACC GGCCGCATCC GTGACTATGT CAGGAGCTGG 1200

CCCAACCAGA CCGAAATCAC AGTGCCCGGC GTGCATTTCG TTCAGGAGGA CAGCGATGGC 1260

GTCGTATCGT GGGCGGGCGC TCGGCAGCAT CGGCGACCTG GGAGCGCTCT CATTTCACGA 1320

GACCAAGAAT GTGATTTCCG GCGAAGGCGG CGCCCTGCTT GTCAACTCAT AAGACTTCCT 1380

GCTCCGGGCA GAGATTCTCA GGGAAAAGGG CACCAATCGC AGCCGCTTCC TTCGCAACGA 1440

GGTCGACAAA TATACGTGGC AGGACAAAGG TCTTCCTATT TGCCCAGCGA ATTAGTCGCT 1500

GCCTTTCTAT GGGCTCAGTT CGAGGAAGCC GAGCGGATCA CGCGTATCCG ATTGGACCTA 1560

TGGAACCGGT ATCATGAAAG CTTCGAATCA TTGGAACAGC GGGGGCTCCT GCGCCGTCCG 1620 

ATCATCCCAC AGGGCTGCTC TGACAACGCC CACATGTACT ACGTGTTACT AGCGCCCAGC 1680 

GCCGATCGGG AGGAGGTGCT GGCGCGTCTG ACGAGCGAAG GTATAGGCGC GGTCTTTCAT 1740

TACGTGCCGC TTCACGATTC GCCGGCCGGG CGTCGCT 1777 
(2) INFORMATION FOR SEQ ID NO: 152: 

(1) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 324 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

GAGATTGAAT CGTACCGGTC TCCTTAGCGG CTCCGTCCCG TGAATGCCCA TATCACGCAC 60 

GGCCATGTTC TGGCTGTCGA CCTTCGCCCC ATGCCCGGAC GTTGGTAAAC CCAGGGTTTG 120 

ATCAGTAATT CCGGGGGACG GTTGCGGGAA GGCGGCCAGG ATGTGCGTGA GCCGCGGCGC 180 

CGCCGTCGCC CAGGCGACCG CTGGATGCTC AGCCCCGGTG CGGCGACGTA GCCAGCGTTT 240

GGCGCGTGTC GTCCACAGTG GTACTCCGGT GACGACGCGG CGCGGTGCCT GGGTGAAGAC 300

CGTGACCGAC GCCGCCGATT CAGA 324

(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1338 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

GCGGTACCGC CGCGTTGCGC TGGCACGGGA CCTGTACGAC CTGAACCACT TCGCCTCGCG 60 

AACGATTGAC GAACCGCTCG TGCGGCGGCT GTGGGTGCTC AAGGTGTGGG GTGATGTCGT 120 

CGATGACCGG CGCGGCACCC GGCCACTACG CGTCGAAGAC GTCCTCGCCG CCCGCAGCGA 180

GCACGACTTC CAGCCCGACT CGATCi:5GCGT GCTGACCCGT CCTGTCGCTA TGGCTGCCTG 240

GGAAGCTCGC GTTCGGAAGC GATTTGCGTT CCTCACTGAC CTCi^ACGCCG ACGAGCAGCG 300

GTGGGCCGCC TGCGACGAAC GGCACCGCCG CGAAGTGGAG AACGCGCTGG CGGTGCTGCG 360

GTCCTGATCA ACCTGCCGGC GATCGTGCCG TTCCGCTGGC ACGGTTGCGG CTGGACGCGG 420

CTGAATCGAC TAGATGAGAG CAGTTGGGCA CGAATCCGGC TGTGGTGGTG AGCAAGACAC 480

GAGTACTGTC ATCACTATTG GATG'':ACTGG ATGACCGGCC TGATTi:AGCA GGACCAATGG 540

AACTGCCCGG GGCAAAACGT CTCGGAGATG ATGGGCGTCC CCTCGGAACC CTGCGGTGCT 600

GGCGTCATTC GGACAT:Gi:/r '.:CGijCTCGCG GGArCGIGGT GAC'';CCAGC'':^ CTGA^'\GGAGT 660 

GL-iAGCGGGGC GGTGCArrCG CTGCTG':;ACG GCCG3CAGAC GGTGGTGCTG CGTAA:;GGCG 720 

GGATCGGCGA 3AAG:g:TT: i3AGGTG:;CGG CC:a:GAGTT CTT':^TTGrTC i:CGAi::GGTCG 780 

CGCACAGCCA CG:'CGAGCGG GTTCGCCCCG AGCACCGCGA CCTGCTGGGC CCGGCGGCCG 840

CCGACAGCAC CGACGAGTGT GTGCTACTGC GGGCCGCAGC GAAAGTTGTT GCCGCACTGC 900

CGGTTAACCG GCCAGAGGGT CTGGACGCCA TCGAGGATCT GCACATCTGG ACCGCCGAGT 960 

CGGTGCGCGC CGACCGGCTC GACTTTCGGC CCAAGCACAA ACTGGCCGTC TTGGTGGTCT 1020 

CGGCGATCCC GCTGGCCGAG CCGGTCCGGC TGGCGCGTAG GCCCGAGTAC GGCGGTTGCA 1080 

CCAGCTGGGT GCAGCTGCCG GTGACGCCGA CGTTGGCGGC GCCGGTGCAC GACGAGGCCG 1140

CGCTGGCCGA GGTCGCCGCC CGGGTCCGCG AGGCCGTGGG TTGACTGGGC GGCATCGCTT 1200 

GGGTCTGAGC TGTACGCCCA GTCGGCGCTG CGAGTGATCT GCTGTCGGTT CGGTCCCTGC 1260 

TGGCGTCAAT TGACGGCGCG GGCAACAGCA GCATTGGCGG CGCCATCCTC CGCGCGGCCG 1320 

GCGCCCACCG CTACAACC 1338 
(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 321 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

CCGGCGGCAC CGGCGGCACC GGCGGTACCG GCGGCAACGG CGCTGACGCC GCTGCTGTGG 60 

TGGGCTTCGG CGCGAACGGC GACCCTGGCT TCGCTGGCGG CAAAGGCGGT AACGGCGGAA 120

TAGGTGGGGC OGCGGTGACA GGCGGGGTCG CCGGCGACGG CGGCACCGGC GGCAAAGGTG 180 

GCACCGGCGG TGCCGGCGGC GCCGGCAACG ACGCCGGCAG CACCGGCAAT CCCGGCGGTA 240

AGGGCGGCGA GGGCGGGATC GGCGGTGCCG GCGGGGCCGG CGGCGCGGCG GGCACCGGCA 300

ACGGCGGCCA TGCCGGCAAC C 321 

(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 492 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

GAAGACCCGG CCCCGCCATA TCGATCGGCT CGCCGACTAC TTTCGCCGAA CGTGCACGCG 60 

GCGGCGTCGG GCTGATCATC ACCGGTGGCT ACGCGCCCAA CCGCACCGGA TGGCTGCTGC 120 

CGTTCGCCTC CGAACTCGTC ACTTCGGCGC AAGCCCGACG GCACCGCCGA ATCACCAGGG 180 

CGGTCCACGA TTCGGGTGCA AAGATCCTGC TGCAAATCCT GCACGCCGGA CGCTACGCCT 240

ACCACCCACT TGCGGTCAGC GCCTCGCCGA TCAAGGCGCC GATCACCCCG TTTCGTCCGC 300 

GAGCACTATC GGCTCGCGGG GTCGAAGCGA CCATCGCGGA TTTCGCCCGC TGCGCGCAGT 360 

TGGCCCGCGA TGCCGGCTAC GACGGCGTCG AAATCATGGG CAGCGAAGGG TATCTGCTCA 420

ATCAGTTCCT GGCGCCGCGC ACCAACAAGC GCACCGACTC GTGGGGCGGC ACACCGGCCA 480

ACCGTCGCCG GT 492
(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 536 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

Phe Ala Gln His Leu Val Glu Gly Asp Ala Val Glu Leu Trp Arg Ala
1 5 10 15

Asn Ala Ala Asp Gln Ala Asp Pro Leu Gln Pro Gly Ser Ala Arg Arg
20 25 30

Gln Arg Ala Ser Arg Ser Pro Arg Arg Leu Ala Gly Pro Asn Ala Tyr
35 40 45

His Tyr Ser Asn Asn Arg Ser Ile Leu Cys Gln Arg Trp Pro Leu Pro
50 55 60

Ser Ala Ala Gln Asp Val Ile Cys His Leu Cys Pro His Arg Gln Glu
65 70 75 80

Pro Gly Leu Met Thr Ala Phe Gly Val Glu Pro Tyr Gly Gln Pro Lys
85 90 95

Tyr Leu Glu Ile Ala Gly Lys Arg Met Ala Tyr Ile Asp Glu Gly Lys
100 105 110

Gly Asp Ala Ile Val Phe Gln His Gly Asn Pro Thr Ser Ser Tyr Leu
115 120 125

Trp Arg Asn Ile Met Pro His Leu Glu Gly Leu Gly Arg Leu Val Ala
130 135 140

Cys Asp Leu Ile Gly Met Gly Ala Ser Asp Lys Leu Ser Pro Ser Gly
145 150 155 160

Pro Asp Arg Tyr Ser Tyr Gly Glu Gln Arg Asp Phe Leu Phe Ala Leu
165 170 175

Trp Asp Ala Leu Asp Leu Gly Asp His Val Val Leu Val Leu His Asp
180 185 190

Trp Gly Ser Ala Leu Gly Phe Asp Trp Ala Asn Gln His Arg Asp Arg
195 200 205

Val Gln Gly Ile Ala Phe Met Glu Ala Ile Val Thr Pro Met Thr Trp
210 215 220

Ala Asp Trp Pro Pro Ala Val Arg Gly Val Phe Gln Gly Phe Arg Ser
225 230 235 240

Pro Gln Gly Glu Pro Met Ala Leu Glu His Asn Ile Phe Val Glu Arg
245 250 255

Val Leu Pro Gly Ala Ile Leu Arg Gln Leu Ser Asp Glu Glu Met Asn
260 265 270



His Tyr Arg Arg Pro Phe Val Asn Gly Gly Glu Asp Arg Arg Pro Thr
275 280 285

Leu Ser Trp Pro Arg Asn Leu Pro Ile Asp Gly Glu Pro Ala Glu Val
290 295 300

Val Ala Leu Val Asn Glu Tyr Arg Ser Trp Leu Glu Glu Thr Asp Met
305 310 315 320

Pro Lys Leu Phe Ile Asn Ala Glu Pro Gly Ala Ile Ile Thr Gly Arg
325 330 335

Ile Arg Asp Tyr Val Arg Ser Trp Pro Asn Gln Thr Glu Ile Thr Val
340 345 350

Pro Gly Val His Phe Val Gln Glu Asp Ser Asp Gly Val Val Ser Trp
355 360 365

Ala Gly Ala Arg Gln His Arg Arg Pro Gly Ser Ala Leu Ile Ser Arg
370 375 380

AsfT) Gln Glu Cys Asp Phe Arg Arg Arg Arg Arg Pro Ala Cys Gln Leu
385 390 395 400






Ile Arg Leu Pro Ala Pro Gly Arg Asp Ser Gln Gly Lys Gly His Gln
405 410 415

Ser Gln Pro Leu Pro Ser Gln Arg Gly Arg Gln Ile Tyr Val Ala Gly
420 425 430

Gln Arg Ser Ser Tyr Leu Pro Ser Glu Leu Val Ala Ala Phe Leu Trp
435 440 445

Ala Gln Phe Glu Glu Ala Glu Arg Ile Thr Arg Ile Arg Leu Asp Leu
450 455 460

Trp Asn Arg Tyr His Glu Ser Phe Glu Ser Leu Glu Gln Arg Gly Leu
465 470 475 480

Leu Arg Arg Pro Ile Ile Pro Gln Gly Cys Ser His Asn Ala His Met
485 490 495

Tyr Tyr Val Leu Leu Ala Pro Ser Ala Asp Arg Glu Glu Val Leu Ala
500 505 510

Arg Leu Thr Ser Glu Gly Ile Gly Ala Val Phe His Tyr Val Pro Leu

515 520 525 

His Asp Ser Pro Ala Gly Arg Arg 
530 535 

(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 284 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

Asn Glu Ser Ala Pro Arg Ser Pro Met Leu Pro Ser Ala Arg Pro Arg
1 5 10 15

Tyr Asp Ala Ile Ala Val Leu Leu Asn Glu Met His Ala Gly His Cys
20 25 30

Asp Phe Gly Leu Val Gly Pro Ala Pro Asp Ile Val Thr Asp Ala Ala
35 40 45

Gly Asp Asp Arg Ala Gly Leu Gly Val Asp Glu Gln Phe Arg His Val
50 55 60

Gly Phe Leu Glu Pro Ala Pro Val Leu Val Asp Gln Arg Asp Asp Leu
65 70 75 80

Gly Gly Leu Thr Val Asp Trp Lys Val Ser Trp Pro Arg Gln Arg Gly
85 90 95

Ala Thr Val Leu Ala Ala Val His Glu Trp Pro Pro Ile Val Val His
100 105 110

Phe Leu Val Ala Glu Leu Ser Gln Asp Arg Pro Gly Gln His Pro Phe
115 120 125

Asp Lys Asp Val Val Leu Gln Arg His Trp Leu Ala Leu Arg Arg Ser
130 135 140 

Glu Thr Leu Glu His Thr Pro His Gly Arg Arg Pro Val Arg Pro Arg 
145 150 155 160 

His Arg Gly Asp Asp Arg Phe His Glu Arg Asp Pro Leu His Ser Val 
165 170 175 

Ala Met Leu Val Ser Pro Val Glu Ala Glu Arg Arg Ala Pro Val Val 
180 185 190 

Gln His Gln Tyr His Val Val Ala Glu Val Glu Arg Ile Pro Glu Arg
195 200 205

Glu Gln Lys Val Ser Leu Leu Ala Ile Ala Ile Ala Val Gly Ser Arg
210 215 220

Trp Ala Glu Leu Val Arg Arg Ala His Pro Asp Gln Ile Ala Gly His
225 230 235 240

Gln Pro Ala Gln Pro Phe Gln Val Arg His Asp Val Ala Pro Gln Val
245 250 255

Arg Arg Arg Gly Val Ala Val Leu Lys Asp Asp Gly Val Thr Leu Ala
260 265 270

Phe Val Asp Ile Arg His Ala Leu Pro Gly Asp Phe
275 280

(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 264 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

ATGAACATiJT CGTCGGTGGT GGGTCGCAAG GCCTTTGCGC GATTCGCCGG CTACTCCTCC 60

GCCATGCACG CGATCGCCGG TTTCTCCGAT GCGTTGCGCC AAGAGCTGCG GGGTAGCGGA 120

ATCGCCGTCT CGGTGATCCA CCCGGCGCTG ACCCAGACAC CGCTGTTGGC CAACGTCGAC 180

CCCGCCGACA TGCCGCCGCC GTTTCGCAGC CTCACGCCCA TTCCCGTTCA CTGGGTCGCG 240

GCAGCGGTGC TTGACGGTGT GGCG 264
(2) INFORMATION FOR SEQ ID NO: 159:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

TAGTCGGCGA CGATGACGTC GCGGTCCAGG CCGACCGCTT CAAGCACCAG CGCGACCACG 60 

AAGCCGGTGC GATCCTTACC CGCGAAGCAG TGGGTGAGCA CCGGGCGTCC GGCGGCAAGC 120 

AGTGTGACGA CACGATGTAG CGCGCGCTGT GCTCCATTGC GCGTTGGGAA TTGGCGATAC 180 

TCGTCGGTCA TGTAGCGGGT GGCCGCGTCA TTTATCGACT GGCTGGATTC GCCGGACTCG 240

CCGTTGGACC CGTCATTGGT TAGCAGCCTC TTGAATGCGG TTTCGTGCGG CGCTGAGTCG 300 

TCGGCGTCAT CATCGGCGAG GTCGGGGAAC GGCAGCAGGT GGACGTCGAT GCCGTCCGGA 360 

ACCCGTCCTG GACCGCGGCG GGCAACCTCC CGGGACGACC GCAGGTCGGC AACGTCGGTG 420

ATCCCCAGCC GGCGCAGCGT TGGCCCTCGT GCCGAATTCG GGACGAGGCT GGCGAGCCAC 480

CGGGCATCAC CAAGCAACGC TTGCCCAGTA CGGATCGTCA CTTCCGCATC CGGCAGACCA 540

ATCTCCTCGC CGCCCATCGT CAGATCCCGC TCGTGCGTTG ACAAGAACGG CCGCAGATGT 600

GCCAGCGGGT ATGGGAGATT GAACCGCGCA CGCAGTTCTT CAATCGCTGC GCGCTGCCGC 660

ACTATTGGCA CTTTCCGGCG GTCGCGGTAT TCAGCAAGCA TGCGAGTCTC GACGAACTCG 720

CCCCACGTAA CCCACGGCGT AGCTCGCGGC GTGACGCGGA GGATCGGCGG GTGATCTTTG 780

ccgccacg:t ::;rAGCCGTT gatcgaccgc ttcgcggt-^g c::gcggggag ggggati:agc 840

ttatcgac:t ::7G-:gtatgc ggac ::;gi:aa':; ctgggcgcgt tcgtcgaggf gaagaai:tcc 900

ACCATCGGCA : :ggcacgaa ggtgcggcac ctgacctagc^ tcggcgacgc CGAGATCGGC 960

GAGTACAGCA ACATCGGCGC CTCCAGCGTG TTCGTCAACT ACGACGGTAC GTCCAAACGG 1020

CGCACCACCG TCGGTTCGCA CGTACGGACC GGGTCCGACA CCATGTTCGT GGCCCCAGTA 1080

ACCATCGGCG ACGGCGCGTA TACCGGGGCC GGCACAGTGG TGCGGGAGGA TGTCCCGCCG 1140

(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 227 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 

GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60 

ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGCGCCGGC GGCACCAGCT 120

TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180 

GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCC 227 
(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

CCTCGCCACC ATGGGCGGGC AGGGCGGTAG CGGTGGCGCC GGCTCTACCC CAGGCGCCAA 60 

GGGCGCCCAC GGTTTCACTC CAACCAGCGG CGGCGACGGC GGCGACGGCG GCAACGGCGG 120

CAACTCCCAA GTGGTCGGCG GCAACGGCGG CGACGGCGGC AATGGCGGCA ACGGCGGCAG 180

CGCCGGCACG GGCGGCAACG GCGGCCGCCLi CGGCGACGGC i:;CGTTTGGTG GCATGAGTGC 240

CAACGCCACC AACCCTGGTG TUW^CGGGCC AAACGGTAAC CCCGGCGGCA ACGGTGGCGC 300

CGGC 304

(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1439 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 

GTGGGACGCT GCCGAGGCTG TATAACAAGG ACAACATCGA CCAGCGCCGG CTCGGTGAGC 60 

TGATCGACCT ATTTAACAGT GCGCGCTTCA GCCGGCAGCG CGAGCACCGC GCCCGGGATC 120

TGATGGGTGA GGTCTACGAA TACTTCCTCG GCAATTTCGC TCGCGCGGAA GGGAAGCGGG 180 

GTGGCGAGTT CTTTACCCCG CCCAGCGTGG TCAAGGTGAT CGTGGAGGTG CTGGAGCCGT 240

CGAGTGGGCG GGTGTATGAC CCGTGCTGCG GTTCCGGAGG CATGTTTGTG CAGACCGAGA 300 

AGTTCATCTA CGAACACGAC GGCGATCCGA AGGATGTCTC GATCTATGGC CAGGAAAGCA 360 

TTGAGGAGAC CTGGCGGATG GCGAAGATGA ACCTCGCCAT CCACGGCATC GACAACAAGG 420

GGCTCGGCGC CCGATGGAGT GATACCTTCG CCCGCGACCA GCACCCGGAC GTGCAGATGG 480

ACTACGTGAT GGCCAATCCG CCGTTCAACA TCAAAGACTG GGCCCGCAAC GAGGAAGACC 540

CACGCTGGCG CTTCGGTGTT CCGCCCGCCA ATAACGCCAA CTACGCATGG ATTCAGCACA 600

TCCTGTACAA CTTGGCGCCG GGAGGTCGGG CGGGCGTGGT GATGGCCAAC GGGTCGATGT 660

CGTCGAACTC CAACGGCAAG GGGGATATTC GCGCGCAAAT CGTGGAGGCG GATTTGGTTT 720

CCTGCATGGT CGCGTTACCC ACCCAGCTGT TCCGCAGCAC CGGAATCCCG GTGTGCCTGT 780

GGTTTTTCGC CAAAAACAAG GCGGCAGGTA AGCAAt:^GGl'C TATCAACCGG TGCGGGCAGG 840

TGCTGTT'JAT CGACGCTCGT GAACTGGGCG ACCTAGTGGA CCGGGCCGAG CGGGCGCTGA 900

CC.^CGAGGA GATCGTCCGC ATCGGGGATA CCTTCCACGC GAGCACGACC ACCGGCAACG 960

GCGGCTCCGG TGGTGCCGGC GGTAATGGGG GCACTGGCCT GAA2Gi'^.Ci:^Ci:^ GGCGGTGCTG 1020

GCGGGGCCGG ':GGCAACGCi; Gl^TCTCGCCG i^^CGTGTGGTT CGGCAACGCT GTG ^r.CGGCG 1080

ACGGCGGCAA GGG<:GGC7\Ai: GGCGGCCACG CCGGCGAG .-C CAi:GACGGGC GGCGCCGGCG ] 1 ■) i ] 

GCAAGGGCGG CAACGGCAGG hG^:GGTGCCG CCAGOZGCTC AGGCGTCGTC AA'::GTCACCG 1200

CCGGCCACGG CGGCAACGGC GGCAATGGCG GCAACGGCGG CAACGGCTCC GCGGGCGCCG 1260

GCGGCCAGGG CGGTGCCGGC GGCAGCGCCG GCAACGGCGG CCACGGCGGC GGTGCCACCG 1320 

GCGGCGCCAG CGGCAAGGGC GGCAACGGCA CCAGCGGTGC CGCCAGCGGC TCAGGCGTCA 1380 

TCAACGTCAC CGCCGGCCAC GGCGGCAACG GCGGCAATGG CGGCAACGGC GGCAACGGC 1439
(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 329 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 

GGGCCGGCGG GGCCGGATTT TCTCGTGCCT TGATTGTCGC TGGGGATAAC GGCGGTGATG 60 

GTGGTAACGG CGGGATGGGC GGGGCTGGCG GGGCTGGCGG CCCCGGCGGG GCCGGCGGCC 120 

TGATCAGCCT GCTGGGCGGC CAAGGCGCCG GCGGGGCCGG CGGGACCGGC GGGGCCGGCG 180 

GTGTTGGCGG TGACGGCGGG GCCGGCGGCC CCGGCAACCA GGCCTTCAAC GCAGGTGCCG 240

GCGGGGCCGG CGGCCTGATC AGCCTGCTGG GCGGCCAAGG CGCCGGCGGG GCCGGCGGGA 300

CCGGCGGGGC CGGCGGTGTT GGCGGTGAC 329

(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 80 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

GCAACGGTGG CAACGGCGGC ACCAGCACGA CCGTGG':;GAT GGCCGGAGGT AACTGTGGTG 60

CCGCCGGGCT GATCGGCAA': 80
(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 392 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 

GGGCTGTGTC GCACTCACAC CGCCGCATTC GGCGACGTTG GCCGCCCAAT ATCCAGCTCA 60 

AGGCCTACTA CTTACCGTCG GAGGACCGCC GCATCAAGGT GCGGGTCAGC GCCCAAGGAA 120

TCAAGGTCAT CGACCGCGAC GGGCATCGAG GCCGTCGTCG CGCGGCTCGG GCAGGATCCG 180

CCCCGGCGCA CTTCGCGCGC CAAGCGGGCT CATCGCTCCG AACGGCGGCG ATCCTGTGAG 240

CACAACTGAT GGCGCGCA^C GAGATTCGTC CAATTGTCAA GCCGTGTTCG ACCGCAGGGA 300 

CCGGTTATAC GTATGTCAAC CTATGTCACT CGCAAGAACC GGCATAACGA TCCCGTGATC 360 

CGCCGACAGC CCACGAGTGC AAGACCGTTA CA 392
(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 535 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

ACCGGCGCCA CCGGCGGCAC CGGGTTCGCC GGTGGCGCCG GCGGGGCCGG CGGGCAGGGC 60

GGTATCAGCG GTGCCGCCCG CACCAACGGC TCTGGTGGCG CTGGCGGCAC CGGCGGACAA 120

GGCGGCGCCG GGGGCGCTGG CGGGGCCGGC GCCGATAACC CCACCGGCAT CGGCGGCGCC 180

GGCGGCACCG GCGGCACCGG CGGAGCGi^CC GGAGCCGGCG GGGCCGGTGG CGCCATCGGT 240

ACCGGCGGCA ccGG>::GG :G': ggtgggcagc gtcggtaacg :cgi:^gatcgg CGGTACCGGC 300 

GGTACGGGTG GTGTCGGTGC T'^.CTG ^T-'^GT GCAGGTGC^^JG JTG' :':^GCGG'': TGGCAGCAGC 360 
GCTACCGGTG GGGCl^G!": :;TT CG'^CGGCGGC GCCGGi:GGAG .AAGG':GGACC GGGCGGCAAC 420 
AGCGGTGTGG GCGGCACZAA GGGCT'ZCGGC GGGGCGGZCG GTGCAGGCGG CAAGGGCGGC 480
ACCGGAGGTG CCGGCGGGTC CGGCGCGGAC AACCCCACCG GTGCTGGTTT CCCCG 535

(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 690 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 

CCGACGTCGC CGGGGCGATA CGGGGGTCAC CGACTACTAC ATCATCCGCA CCGAGAATCG 60 

GCCGCTGCTG CAACCGCTGC GGGCGGTGCC GGTCATCGGA GATCCGCTGG CCGACCTGAT 120 

CCAGCCGAAC CTGAAGGTGA TCGTCAACCT GGGCTACGGC GACCCGAACT ACGGCTACTC 180 

GACGAGCTAC GCCGATGTGC GAACGCCGTT CGGGCTGTGG CCGAACGTGC CGCCTCAGGT 240

CATCGCCGAT GCCCTGGCCG CCGGAACACA AGAAGGCATC CTTGACTTCA CGGCCGACCT 300 

GCAGGCGCTG TCCGCGCAAC CGCTCACGCT CCCGCAGATC CAGCTGCCGC AACCCGCCGA 360 

TCTGGTGGCC GCGGTGGCCG CCGCACCGAC GCCGGCCGAG GTGGTGAACA CGCTCGCCAG 420 

GATCATCTCA ACCAACTACG CCGTCCTGCT GCCCACCGTG GACATCGCCC TCGCCTGGTC 480

ACCACCCTGC CGCTGTACAC CACCCAACTG TTCGTCAGGC AACTCGCTGC GGGCAATCTG 540

ATCAACGCGA TCGGCTATCC CCTGGCGGCC ACCGTAGGTT TAGGCACGAT CGATAGGGGG 600 

CGGCGTGGAA TTGCTCACCC TCCTCGCGGC GGCCTCGGAC ACCGTTCGAA ACATCGAGGG 660

CCTCGTCACC TAACGGATTC CCGACGGCAT 690 
(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 407 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

ACGGTGACGG CGGTACTGGC GGCGGCCACG GCGGCAACGG CGGGAATCCC GGGTGGCTCT 60

TGGGCACAGC CGGGGGTGGC GGCAA::GGTG GCGCCGGCAG CACCGGTACT GCAGGTGGCG 120

GCTCTGGGGG CACCGGCGGC GACGGCGGGA CCGGCGGGCG TGGCGGCCTG TTAATGGGCG 180

CCGGCGCCGG CGGGCACGGT GGCACTGGCG GCGCGGGCGG TGCCGGTGTC GACGGTGGCG 240

GCGCCGGCGG GGCCGGCGGG GCCGGCGGCA ACGGCGGCGC CGGGGGTCAA GCCGCCCTGC 300 

TGTTCGGGCG CGGCGGCACC GGCGGAGCCG GCGGGTACGG CGGCGATGGC GGTGGCGGCG 360 

GTGACGGCTT CGACGGCACG ATGGCCGGCC TGGGTGGTAC CGGTGGC 407
(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 

GATCGGTCAG CGCATCGCCC TCGGCGGCAA GCGATTCCGC GGTCTCACCG AAGAACATCG 60 

TGCACGCGGC GGCGCGGACC AGCCCGCTGC GCTGCGGCGC GTCGAACGCC TCCAGCAGGC 120 

ACAGCCAGTC CTTGGCGGCC TGCGAGGCGA ACACGTCGGT GTCACCGGTG TAGATCGCCG 180

GGATGCCCGC CTCCGCCAAC GCATTCCGGC ACGCCCGCGC GTCTTTGTGA TGCTCGACGA 240

TCACCGCGAT GTCTGCGGCC ACCACGGGCC GCCCGGCGAA GGTGGCCCCG CTGGCCAGTA 300 

GCGCCGCGAC GTCGGCGGCC AGGTCGTCGG GGATGTGCCG GCGCAGCGCT CCGGCGCGAC 360 

GCCCGAAAAA CGACCCCTCA CCCAGCTGGG TCCCGCTGGC ATATCCCTTG CCGTCCTGGG 420

CGATATTGGA CGCGCATGCC CCGACCGCGT ACAGGCCGGC CACCACCG 468

(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

GGTGGTAACG GCGGCCAGGG TGGCATCGGC GGCGCCGGCG AGAGAGGCGC CGACGGCGCC 60

GGCCCCAATG CTAACGGCGC AAACGGCGAG AACGGCGGTA GCGGTGGTAA CGGTGGCGAC 120

GGCGGCGCCG GCGGCAATGG CGGCGCGGGC GGCAACGCGC AGGCGGCCGG GTACACCGAC 180

GGCGCCACGG GCACCGGCGG CGACGGCGGC AACGGCGGC 219



(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 494 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 

TAGCTCCGGC GAGGGCGGCA AGGGCGGCGA CGGTGGCGAC GGCGGTGACG GCGTCGGCGG 60 

CAACAGTTCC GTCACCCAAG GCGGCAGCGG CGGTGGCGGC GGCGCCGGCG GCGCCGGCGG 120 

CAGCGGCTTT TTCGGCGGCA AGGGCGGCTT CGGCGGCGAC GGCGGTCAGG GCGGCCCCAA 180 

CGGCGGCGGT ACCGTCGGCA CCGTGGCCGG TGGCGGCGGC AACGGCGGTG TCGGCGGCCG 240

GGGCGGCGAC GGCGTCTTTG CCGGTGCCGG CGGCCAGGGC GGCCTCGGTG GGCAGGGCGG 300 

CAATGGCGGC GGCTCCACCG GCGGCAACGG CGGCCTTGGC GGCGCGGGCG GTGGCGGAGG 360 

CAACGCCCCG GCTCGTGCCG AATCCGGGCT GACCATGGAC AGCGCGGCCA AGTTCGCTGC 420

CATCGCATCA GGCGCGTACT GCCCCGAACA CCTGGAACAT CACCCGAGTT AGCGGGGCGC 480

ATTTCCTGAT CACC 494
(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 220 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:

GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 60

TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 120

CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 180

GCCAGAGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC 220



(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 388 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 

ATGGCGGCAA CGGGGGCCCC GGCGGTGCTG GCGGGGCCGG CGACTACAAT TTCCAACGGC 60

GGGCAGGGTG GTGCCGGCGG CCAAGGCGGC CAAGGCGGCC TGGGCGGGGC AAGCACCACC 120

TGATCGGCCT AGCCGCACCC GGGAAAGCCG ATCCAACAGG CGACGATGCC GCCTTCCTTG 180

CCGCGTTGGA CCAGGCCGGC ATCACCTACG CTGACCCAGG CCACGCCATA ACGGCCGCCA 240

AGGCGATGTG TGGGCTGTGT GCTAACGGCG TAACAGGTCT ACAGCTGGTC GCGGACCTGC 300

GGGACTACAA TCCCGGGCTG ACCATGGACA GCGCGGCCAA GTTCGCTGCC ATCGCATCAG 360

GCGCGTACTG CCCCGAACAC CTGGAACA 388 

(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 400 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60

ACGGCGGCCA AGGCGGCAGC GGCGGCACCG GCGGCAACGC CGGGGCCGGC GGCACCAGCT 120

TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180

GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCCGGC ACCACAGGCG 240

GCGACGGCGG GGCCGGCGGG GCCGGCGGAA CCGGCGGAAC CGGCGGAGCC GCCGGCACCG 300

GCACCGGCGG CCAACAAGGC AACGGCGGCA ACGGCGGCAC CGGCGGCAAA GGCGGCACCG 360

GCGGCGACGG TGCACTCTCA GGCAGCACCG GTGGTGCCGG 400
(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 538 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

GGCAACGGCG GCAACGGCGG CATCGCCGGC ATTGGGCGGC AACGGCGTTC CGGGACGGGC 60

AGCGGCAACG GCGGCCAACG GCGGCAGCGG CGGCAACGGC GGCAACGCCG GCATGGGCGG 120

CAACAGCGGC ACCGGCAGCG GCGACGGCGG TGCCGGCGGG AACGGCGGCG CGGCGGGCAC 180

GGGCGGCACC GGCGGCGACG GCGGCCTCAC CGGTACTGGC GGCACCGGCG GCAGCGGTGG 240

CACCGGCGGT GACGGCGGTA ACGGCGGCAA CGGAGCAGAT AACACCGCAA ACATGACTGC 300

GCAGGCGGGr GGTGACGGTG GCAACGGCGG CGACGGTGGC TTCGGCGGCG GGGCCGGGGC 360

CGGCGGCGGT GGCTTGACCG CTGGCGCCAA CGGCACCGGC GGGCAAGGCG GCGCCGGCGG 420

CGATGGCGGC AACGGGGCCA TCGGCGGCCA CGGCCCACTC ACTGACGACC CCGGCGGCAA 480

CGGGGGCACC GGCGGCAACG GCGGCACCGG CGGCACCGGC GGCGCGGGCA TCGGCAGC 538

(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 239 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 60 

TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 120 

CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 180 

GCCACGGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC CGGTGGTGCC GGCGGCACC 239
(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 985 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

AGCAGCGCTA CCGGTGGCGC CGGGTTCGCC GGCGGCGCCG GCGGAGAAGG CGGAGCGGGC 60

GGCAACAGCG GTGTGGGCGG CACCAACGGC TCCGGCGGCG CCGGCGGTGC AGGCGGCAAG 120

GGCGGCACCG GAGGTGCCGG CGGGTCCGGC GCGGACAACC CCACCGGTGC TGGTTTCGCC 180

GGTGGCGCCG GCGGCACAGG TGGCGCGGCC GGCGCCGGCG GGGCCGGCGG GGCGACCGGT 240

ACCGGCGGCA CCGGCGGCGT TGTCGGCGCC ACCGGTAGTG CAGGCATCGG CGGGGCCGGC 300

GGCCGCGGCG GTGACGGCGG CGATGGGGCC AGCGGTCTCG GCCTGGGCCT CTCCGGCTTT 360

GACGGCGGCC AAGGCGGCCA AGGCGGGGCC GGCGGCAGCG CCGGCGCCGG CGGCATCAAC 420

ggg^.ccgg^g gggccggcgg caacggcggc gacggcgggg acggcgcaac cggtgccgca a so 

ggtctc::;G';^g acaacggcgg ggtcggcggt gacggtgggg ccggtggcgc cgccggc/vac .S4 0 

ggcggcaacg cggi^cgtcgg cctgacagcc .aaggccggcg acggcggci^c cgcgggcaat 600 

GGCGGCAACG GGGGCGCCGG CGGTGCTGGC GGGGCCGGCG ACAACAATTT CAACGGCGGC 660

cag:.gtggT':. ccggcggcca aggcggccaa '^.gcggcttgg gcggggcaag cac:acctga 720 

TCGGccTA'y: ':gcaccci:^G''j aaaG'::cgatc caa.caggcga cgatgccgcc tti^cttgcc^:; 7^C 

cgttggaC'::a :,gccggcatc acctA'^gcti;^ acccaggcca cgccataa=:g Gozy:.':AAGG &4o 

cgatgtgtgg gctgtgtgct aacggcgtaa lAGGTi:taca gctG!:^tcgcg GAi::TGCGGi:; ^ro 

AATA'2AATC':^ CGi^GCTGACC ATGl";ACAGCC CCGCCAAGTT CGCTGCCATC GCATCAGGCi"; 960

CGTACTGCCC CGAACACCTG GAACA 985

(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2138 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 

CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60

CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120

ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180

AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240

AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300 

CCATCACACC GTGCGAACTC ACGGCGGCTA AAAACGCCGC CCAACAGCTG GTATTGTCCG 360

CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420

CGCTGCGCAA CGCGGCCAAG GCGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480

ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540

CGGCCGAACT AACCGATACC CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600

TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTTTG 660

CGGATGGGTG GAACACTTTC AACCTGACGC TGCAAGGCGA CGTCAAGCGG TTCCGGGGGT 720

TTGACAACTG GGAAGGCGAT GCGGCTACCG CTTGCGAGGC TTCGCTCGAT CAACAACGGC 780

AATGGATACT CCACATGGCC AAATTGAGCG CTGCGATGGC CAAGCAGGCT CAATATGTCG 840

CGCAGCTGCA CGTGTGGGCT AGGCGGGAAC ATCCGACTTA TGAAGACATA GTCGGGCTCG 900

AACG3GTTTA CGCGGAAAAC CCTTCGGCCC GCGACCAAAT TCTCCCGGTG TACGCGGAGT 960

ATCAGCAGAG GTCGGAGAAG GTGCTGAGCG AATACAACAA CAAGGCAGCC CTGGAACCGG 1020

TAAACCCGCC GAAGGCTCCC GCCGCCATCA AGATCGACGG GCCCCCGCCT CCGCAAGAGC 1080

ai:^ggatt:;at gggT'^gcttc ctgatgccgc cgtctgac-^g ctccggtgtg actcccggta 1140

CCGGGATGCC AGCCGCACCG ATGGTTCCGC CTACCGGATC GCCGGGTGGT GGCCTCCCGG 1200

CTGACACGGC GGCGCAGCTG ACGTCGGCTG GGCGGGAAGC CGCAGCGCTG TCGGGCGACG 1260

TGGCGGTCAA AGCGGCATGG CTCGGTGGCG GTGGAGGCGG CGGGGTGCCG TCGGCGCCGT 1320 

TGGGATCCGC GATCGGGGGC GCCGAATCGG TGCGGCCCGC TGGCGCTGGT GACATTGCCG 1380 

GCTTAGGCCA GGGAAGGGCC GGCGGCGGCG CCGCGCTGGG CGGCGGTGGC ATGGGAATGC 1440

CGATGGGTGC CGCGCATCAG GGACAAGGGG GCGCCAAGTC CAAGGGTTCT CAGCAGGAAG 1500 

ACGAGGCGCT CTACACCGAG GATCGGGCAT GGACCGAGGC CGTCATTGGT AACCGTCGGC 1560

GCCAGGACAG TAAGGAGTCG AAGTGAGCAT GGACGAATTG GACCCGCATG TCGCCCGGGC 1620 

GTTGACGCTG GCGGCGCGGT TTCAGTCGGC CCTAGACGGG ACGCTCAATC AGATGAACAA 1680 

CGGATCCTTC CGCGCCACCG ACGAAGCCGA GACCGTCGAA GTGACGATCA ATGGGCACCA 1740

GTGGCTCACC GGCCTGCGCA TCGAAGATGG TTTGCTGAAG AAGCTGGGTG CCGAGGCGGT 1800 

GGCTCAGCGG GTCAACGAGG CGCTGCACAA TGCGCAGGCC GCGGCGTCCG CGTATAACGA 1860

CGCGGCGGGC GAGCAGCTGA CCGCTGCGTT ATCGGCCATG TCCCGCGCGA TGAACGAAGG 1920

AATGGCCTAA GCCCATTGTT GCGGTGGTAG CGACTACGCA CCGAATGAGC GCCGCAATGC 1980

GGTCATTCAG CGCGCCCGAC ACGGCGTGAG TACGCATTGT CAATGTTTTG ACATGGATCG 2040 

GCCGGGTTCG GAGGGCGCCA TAGTCCTGGT CGCCAATATT GCCGCAGCTA GCTGGTCTTA 2100 

GGTTCGGTTA CGCTGGTTAA TTATGACGTC CGTTACCA 2138 
(2) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 460 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn
1 5 10 15

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val
20 25 30

Pro Ile Thr Pro Cys Glu Leu Thr Ala Ala Lys Asn Ala Ala Gln Gln
35 40 45

Leu Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala
50 55 60

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Ala
65 70 75 80 

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly 
85 90 95 

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser
100 105 110 

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro 
115 120 125 

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp
130 135 140

Gln Gly Ala Ser Leu Ala His Phe Ala Asp Gly Trp Asn Thr Phe Asn
145 150 155 160

Leu Thr Leu Gln Gly Asp Val Lys Arg Phe Arg Gly Phe Asp Asn Trp
165 170 175

Glu Gly Asp Ala Ala Thr Ala Cys Glu Ala Ser Leu Asp Gln Gln Arg
180 185 190

Gln Trp Ile Leu His Met Ala Lys Leu Ser Ala Ala Met Ala Lys Gln
195 200 205

Ala Gln Tyr Val Ala Gln Leu His Val Trp Ala Arg Arg Glu His Pro
210 215 220

Thr Tyr Glu Asp Ile Val Gly Leu Glu Arg Leu Tyr Ala Glu Asn Pro
225 230 235 240

Ser Ala Arg Asp Gln Ile Leu Pro Val Tyr Ala Glu Tyr Gln Gln Arg
245 250 255 

Ser Glu Lys Val Leu Thr Glu Tyr Asn Asn Lys Ala Ala Leu Glu Pro 
260 265 270 

Val Asn Pro Pro Lys Pro Pro Pro Ala Ile Lys Ile Asp Pro Pro Pro
275 280 285

Pro Pro Gln Glu Gln Gly Leu Ile Pro Gly Phe Leu Met Pro Pro Ser
290 295 300 

Asp Gly Ser Gly Val Thr Pro Gly Thr Gly Met Pro Ala Ala Pro Met 
305 310 315 320 



Val Pro Pro Thr Gly Ser Pro Gly Gly Gly Leu Pro Ala Asp Thr Ala
325 330 335

Ala Gln Leu Thr Ser Ala Gly Arg Glu Ala Ala Ala Leu Ser Gly Asp
340 345 350

Val Ala Val Lys Ala Ala Ser Leu Gly Gly Gly Gly Gly Gly Gly Val
355 360 365

Pro Ser Ala Pro Leu Gly Ser Ala Ile Gly Gly Ala Glu Ser Val Arg
370 375 380

Pro Ala Gly Ala Gly Asp Ile Ala Gly Leu Gly Gln Gly Arg Ala Gly
385 390 395 400 

Gly Gly Ala Ala Leu Gly Gly Gly Gly Met Gly Met Pro Met Gly Ala 
405 410 415 

Ala His Gln Gly Gln Gly Gly Ala Lys Ser Lys Gly Ser Gln Gln Glu
420 425 430

Asp Glu Ala Leu Tyr Thr Glu Asp Arg Ala Trp Thr Glu Ala Val Ile
435 440 445

Gly Asn Arg Arg Arg Gln Asp Ser Lys Glu Ser Lys
450 455 460 

(2) INFORMATION FOR SEQ ID NO:180: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 277 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 

Ala Gly Asn Val Thr Ser Ala Ser Gly Pro His Arg Phe Gly Ala Pro
1 5 10 15

Asp Arg Gly Ser Gln Arg Arg Arg Arg His Pro Ala Ala Ser Thr Ala
20 25 30

Thr Glu Arg Cys Arg Phe Asp Arg His Val Ala Arg Gln Arg Cys Gly
35 40 45

Phe Pro Pro Ser Arg Arg Gln Leu Arg Arg Arg Val Ser Arg Glu Ala
50 55 60

Thr Thr Arg Arg Ser Gly Arg Arg Asn His Arg Cys Gly Trp His Pro
65 70 75 80

Gly Thr Gly Ser His Thr Gly Ala Val Arg Arg Arg His Gln Glu Ala
85 90 95

Arg Asp Gln Ser Leu Leu Leu Arg Arg Arg Gly Arg Val Asp Leu Asp
100 105 110

Gly Gly Gly Arg Leu Arg Arg Val Tyr Arg Phe Gln Gly Cys Leu Val
115 120 125

Val Val Phe Gly Gln His Leu Leu Arg Pro Leu Leu Ile Leu Arg Val
130 135 140 

His Arg Glu Asn Leu Val Ala Gly Arg Arg Val Phe Arg Val Lys Pro 
145 150 155 160 

Phe Glu Pro Asp Tyr Val Phe Ile Ser Arg Met Phe Pro Pro Ser Pro
165 170 175

His Val Gln Leu Arg Asp Ile Leu Ser Leu Leu Gly His Arg Ser Ala
180 185 190

Gln Phe Gly His Val Glu Tyr Pro Leu Pro Leu Leu Ile Glu Arg Ser
195 200 205

Leu Ala Ser Gly Ser Arg Ile Ala Phe Pro Val Val Lys Pro Pro Glu
210 215 220

Pro Leu Asp Val Ala Leu Gln Arg Gln Val Glu Ser Val Pro Pro Ile
225 230 235 240 

Arg Lys Val Arg Glu Arg Cys Ala Leu Val Ala Arg Phe Glu Leu Pro 
245 250 255 

Cys Arg Phe Phe Glu Ile His Glu Val Gly Phe Thr Gly Arg Gly His
260 265 270

Pro Arg Arg Ile Gly
275 

(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

Arg Val Ala Ala Ser Phe Ile Asp Trp Leu Asp Ser Pro Asp Ser Pro
1 5 10 15

Leu Asp Pro Ser Leu Val Ser Ser Leu Leu Asn Ala Val Ser Cys Gly
20 25 30

Ala Glu Ser Ser Ala Ser Ser Ser Ala Arg Ser Gly Asn Gly Ser Arg
35 40 45

Trp Thr Ser Met Pro Ser Gly Thr Arg Pro Gly Pro Arg Arg Ala Thr
50 55 60

Ser Arg Asp Asp Arg Arg Ser Ala Thr Ser Val Ile Pro Ser Arg Arg
65 70 75 80 

Ser Val Ala Pro Arg Ala Glu Phe Gly Thr Arg Leu Ala Ser His Arg 
85 90 95 

Ala Ser Pro Ser Asn Ala Cys Pro Val Arg Ile Val Thr Ser Ala Ser
100 105 110

Gly Arg Pro Ile Ser Ser Pro Pro Ile Val Arg Ser Arg Ser Cys Val
115 120 125

Asp Lys Asn Gly Arg Arg Cys Ala Ser Gly Tyr Arg Arg Leu Asn Arg
130 135 140

Ala Arg Ser Ser Ser Ile Ala Ala Arg Cys Arg Thr Ile Gly Thr Phe
145 150 155 160

Arg Arg Ser Arg Tyr Ser Ala Ser Met Arg Val Ser Thr Asn Ser Pro
165 170 175

His Val Thr His Gly Val Ala Pro Gly Val Thr Arg Arg Ile Gly Gly
180 185 190 



(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 196 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

Gln Glu Arg Pro Gln Met Cys Gln Arg Val Ser Glu Ile Glu Pro Arg
1 5 10 15

Thr Gln Phe Phe Asn Arg Cys Ala Leu Pro His Tyr Trp His Phe Pro
20 25 30

Ala Val Ala Val Phe Ser Lys His Ala Ser Leu Asp Glu Leu Ala Pro
35 40 45

Arg Asn Pro Arg Arg Ser Ser Arg Arg Asp Ala Glu Asp Arg Arg Val
50 55 60

Ile Phe Ala Ala Thr Leu Val Ala Val Asp Pro Pro Leu Arg Gly Ala
65 70 75 80

Gly Gly Glu Ala Asp Gln Leu Ile Asp Leu Gly Val Cys Arg Arg Gln
85 90 95

Ala Gly Arg Val Arg Arg Gly Gln Glu Leu His His Arg His Arg His
100 105 110

Gln Gly Ala Ala Pro Asp Leu Arg Arg Arg Arg Arg His Arg Arg Val
115 120 125

Gln Gln His Arg Arg Leu Gln Arg Val Arg Gln Leu Arg Arg Tyr Val
130 135 140

Gln Thr Ala His His Arg Arg Phe Ala Arg Thr Asp Arg Val Arg His
145 150 155 160

His Val Arg Gly Pro Ser Asn His Arg Arg Arg Arg Val Tyr Arg Gly
165 170 175 

Arg His Ser Gly Ala Gly Gly Cys Pro Ala Gly Gly Ala Gly Ser Val 
180 185 190 

Gly Gly Ser Ala 
195 

(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 311 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

Val Arg Cys Gly Thr Leu Val Pro Val Pro Met Val Glu Phe Leu Thr
1 5 10 15

Ser Thr Asn Ala Pro Ser Leu Pro Ser Ala Tyr Ala Glu Val Asp Lys
20 25 30

Leu Ile Gly Leu Pro Ala Gly Thr Ala Lys Arg Trp Ile Asn Gly Tyr
35 40 45

Glu Arg Gly Gly Lys Asp His Pro Pro Ile Leu Arg Val Thr Pro Gly
50 55 60

Ala Thr Pro Trp Val Thr Trp Gly Glu Phe Val Glu Thr Arg Met Leu
65 70 75 80

Ala Glu Tyr Arg Asp Arg Arg Lys Val Pro Ile Val Arg Gln Arg Ala
85 90 95

Ala Ile Glu Glu Leu Arg Ala Arg Phe Asn Leu Arg Tyr Pro Leu Ala
100 105 110

His Leu Arg Pro Phe Leu Ser Thr His Glu Arg Asp Leu Thr Met Gly
115 120 125

Gly Glu Glu Ile Gly Leu Pro Asp Ala Glu Val Thr Ile Arg Thr Gly
130 135 140

Gln Ala Leu Leu Gly Asp Ala Arg Trp Leu Ala Ser Leu Val Pro Asn
145 150 155 160

Ser Ala Arg Gly Ala Thr Leu Arg Arg Leu Gly Ile Thr Asp Val Ala
165 170 175

Asp Leu Arg Ser Ser Arg Glu Val Ala Arg Arg Gly Pro Gly Arg Val
180 185 190

Pro Asp Gly Ile Asp Val His Leu Leu Pro Phe Pro Asp Leu Ala Asp
195 200 205

Asp Asp Ala Asp Asp Ser Ala Pro His Glu Thr Ala Phe Lys Arg Leu
210 215 220

Leu Thr Asn Asp Gly Ser Asn Gly Glu Ser Gly Glu Ser Ser Gln Ser
225 230 235 240

Ile Asn Asp Ala Ala Thr Arg Tyr Met Thr Asp Glu Tyr Arg Gln Phe
245 250 255

Pro Thr Arg Asn Gly Ala Gln Arg Ala Leu His Arg Val Val Thr Leu
260 265 270

Leu Ala Ala Gly Arg Pro Val Leu Thr His Cys Phe Ala Gly Lys Asp
275 280 285

Arg Thr Gly Phe Val Val Ala Leu Val Leu Glu Ala Val Gly Leu Asp
290 295 300

Arg Asp Val Ile Val Ala Asp
305 310

(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2072 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 

CTCGTGCCGA TTCGGCACGA GCTGAGCAGC CCAAGGGGCC GTTCGGCGAA GTCATCGAGG 60

CATTCGCCGA CGGGCTGGCC GGCAAGGGTA AGCAAATCAA CACCACGCTG AACAGCCTGT 120

CGCAGGCGTT GAACGCCTTG AATGAGGGCC GCGGCGACTT CTTCGCGGTG GTACGCAGCC 180

TGGCGCTATT CGTCAACGCG CTACATCAGG ACGACCAACA GTTCGTCGCG TTGAACAAGA 240

ACCTTGCGGA GTTCACCGAC AGGTTGACCC ACTCCGATGC GGACCTGTCG AACGCCATCC 300

AGCAATTCGA CAGCTTGCTC GCCGTCGCGC GCCCGTTCTT CGCCAAGAAC CGCGAGGTGC 360

TGACGCATGA CGTCAATAAT CTCGCGACCG TGACCACCAC GTTGCTGCAG CCCGATCCGT 420

TGGATGGGTT GGAGACCGTC CTCCACATCT TCCCGACGCT GGCGGCGAAC ATTAACCAGC 480

TTTACCATCC GACACACGGT GGCGTGGTGT CGCTTTCCGC GTTCACGAAT TTCGCCAACC 540

CGATGGAGTT CATCTGCAGC TCGATTCAGG CGGGTAGCCG GCTCGGTTAT CAAGAGTCGG 600

CCGAACTCTG TGCGCAGTAT CTGGCGCCAG TCCTCGATGC GATCAAGTTC AACTACTTTC 660

CGTTCGGCCT GAACGTGGCC AGCACCGCCT CGACACTGCC TAAAGAGATC GCGTACTCCG 720

AGCCCCGCTT GCAGCCGCCC AACGGGTACA AGGACACCAC GGTGCCCGGC ATCTGGGTGC 780

CGGATACGCC GTHSTCACAC CGCAACACGC AGCCCGGTTG GGTGGTGGCA CCCGGGATGC 840

AAGGGGTTCA GGTGGGACCG ATCACGCAGG GTTTGCTGAC GCCGGAGTCC CTGGCCGAAC 900

TCATGGGTGG TCCCGATATC GCCCCTCCGT CGTCAGGGCT GCAAACCCCG CCCGGACCCC 960

CGAATGCGTA CGACGAGTAC CCCGTGCTGC CGCCGATCGG TTTACAGGCC CCACAGGTGC 1020

CGATACCACC GCCGCCTCCT GGGCCCGACG TAATCCCGGG TCCGGTGCGA CCGGTCTTGG 1080

CGGCGATCGT GTTCCCAAGA GATCGCCCGG CAGCGTCGGA AAAGTTCGAC TACATGGGCC 1140

TCTTGTTGCT GTCGCCGGGC CT(-,GCGACCT TCCTGTTC<:^.G GGT'irrCATCT AGCCrr.GCCC 1200

GTGGAACGAT 'JGCCGATCi^^''^ i:ACGTGTTGA TACCGGCGAT CACGGGCGTi:; > -Ci^/rTGATCG 1260

CGGCATTCGT Cl^CACATTC':. TGi^iTACCGCA CAGAACATCG GCTGATAGAC ATl^CGCTTGT 1320

TCCAGAACCG AGCGGTCGCG GAGGCCAACA TGACGATGAC GGTGCTCTiTC ':TCGGGCTGT 1380

TTGGCTCCTT CTTGCTGCTC CCGAGCTACC TCCAGCAAGT GTTGCACCAA TCACCGATGC 1440

AATCGGGGGT GCATATCATC CCACAGGGCC TCGGTGCCAT GCTGGCGATG CCGATCGCCG 1500

GAGCGATGAT GGACCGACGG GGACCGGCCA AGATCGTGCT GGTTGGGATC ATGCTGATCG 1560

CTGCGGGGTT GGGCACCTTC GCCTTTGGTG TCGCGCGGCA AGCGGACTAC TTACCCATTC 1620

TGCCGACCGG GCTGGCAATC ATGGGCATGG GCATGGGCTG CTCCATGATG CCACTGTCCG 1680

GGGCGGCAGT GCAGACCCTG GCCCCAGATC AGATCGCTCG CGGTTCGACG CTGATCAGCG 1740

TCAACCAGCA GGTGGGCGGT TCGATAGGGA CCGCACTGAT GTCGGTGCTG CTCACCTACC 1800

AGTTCAATCA CAGCGAAATC ATCGCTACTG CAAAGAAAGT CGCACTGACC CCAGAGAGTG 1860

GCGCCGGGCG GGGGGCGGCG GTTGACCCTT CCTCGCTAGC GCGCCAAACC AACTTCGCGG 1920

CCCAACTGCT GCATGACCTT TCGCACGCCT ACGCGGTGGT ATTCGTGATA GCGACCGCGC 1980

TAGTGGTCTC GACGCTGATC CCCGCGGCAT TCCTGCCGAA ACAGCAGGCT AGTCATCGAA 2040
GAGCACCGTT GCTATCCGCA TGACGTCTGC TT 2072

(2) INFORMATION FOR SEQ ID NO: 185:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1923 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 
TCACCCCGGA GAAGTCGTTC GTCGACGACC TGGACATCGA CTCGCTGTCG ATGGTCGAGA 60

TCGCCGTGCA GACCGAGGAC AAGTACGGCG TCAAGATCCC CGACGAGGAC CTCGCCGGTC 120

TGCGTACCGT CGGTGACGTT GTCGCCTACA TCCAGAAGCT CGAGGAAGAA AACCCGGAGG 180

CGGCTCAGGC GTTGCGCGCG AAGATTGAGT CGGAGAACCC CGATGCGGCA CGAGCAGATC 240

ggtgcgtttc acccacatcg caagctcgag acgcccgtcc tcctgttgca cgctcagcca 300

ggttggcgtg tcgccgcctt ccagcaagtg ttcccac':ac acgaagggac cctcgcgaaa 360

ggtgactgat ccgcggacca catagtcgat gccaccgtgg ctgacaattg cgccgggtcc 420

GAGTTGGCGG GGGCCGA/iTT GCGGCATTGC GTCGAAGC^CC AGCGGATCCi: GGCGCCCGCC 480

CGGCGTGGCT GGTGTTTTGG GCCGCCGGAT GGCCACGACG AGAACGACGA TGGCGGCGAT 540

GAACAGCGCC ACGGC7VATCA CGACCAGCAG ATTTCCCACG CATACCCTCT CGTACCGCTG 600

CGCCGCGGTT GGTCGATCGG TCGCATATCG ATGGCGCCGT TTAACGTAAC AGCTTTCGCG 660

GGACCGGGGG TCACAACGGG CGAGTTGTCC GGCCGGGAAC CCGGCAGGTC TCGGCCGCGG 720

TCACCCCAGC TCACTGGTGC ACCATCCGGG TGTCGGTGAG CGTGCAACTC AAACACACTC 780

AACGGCAACG GTTTCTCAGG TCACCAGCTC AACCTCGACC CGCAATCGCT CGTACGTTTC 840

GACCGCGCGC AGGTCGCGAG TCAGCAGCTT TGCGCCGGCA GCTTTCGCCG TGAAGCCGAC 900

CAGGGCATCG TAGGTTGCGC CACCGGTGAC ATCGTGCTCG GCGAGGTGGT CGGTCAAGCC 960

GCGATATGAG CAGGCATCCA GTGCCAGGTA GTTGCTGGAG GTGATGTCCG CCAAGTAGGC 1020

GTGGACGGCA ACAGGGGCAA TACGATGCGG CGGTGGTAGC CGGGTCAAGA CCGAATAGGT 1080

TTCCACAGCC GCGTGCGCGA TCAGATGGAC GCCACGGTTG AGCGCGCGCA CGGCGGCCTC 1140

GTGCCCTTCG TGCCAGGTCG CGAATCCGGC AACCAGCACG CTGGTGTCTG GTGCGATCAC 1200

CGCCGTGTGC GATCGAGCGT TTCCCGAACG ATTTCGTCGG TCAACGGGGG CAGGGGACGT 1260

TCTGGCCGTG CGACGAGAAC CGAGCCTTCC CGAACGAGTT CGACACCGGT CGGGGCCGGC 1320

TCAATCTCGA TGCGCCCATC GCGCTCGGTG ATCTCCACCT GGTCGTTCCC GCGCAAGCCA 1380

AGGCGCTCGC GAATCCGCTT GGGAATCACC AGACGTCCTG CGACATCGAT GGTTGTTCGC 1440

ATGGTAGGAA ATTTACCATC GCACGTTCCA TAGGCGTGTC CTGCGCGGGA TGTCGGGACG 1500

ATCCGCTAGC GTATCGAACG ATTGTTTCGG AAATGGCTGA GGGAGCGTGC GGTGCGGGTG 1560

ATGGGTGTCG ATCCCGGGTT GACCCGATGC GGGCTGTCGC TCATCGAGAG TGGGCGTGGT 1620

CGGCAGCTCA CCGCGCTGGA TGTCGACGTG GTGCGCACAC CGT«:GGATGC GGCCTTGGCG 1680

CAGCGCCTGT TGGCCATCAG i:GATGCCGTC GAGCACTGGC TGGACACCCA TCATCCGGAG 1740

GTGGTGGCTA TCGAACGGGT GTTCTGTCAG CTCAACGTGA CCACGGTGAT GGGCACCGCG 1800

CAGGCCGGCG GCGTGATCGC CCTGGCGGCG GCCAAACGTG GTGTCGACGT GCATTTCCAT 1860

ACCCCCAGCG AGGTCAAGGC GGCGGTCACT GGCAACGGTT CCGCAGACAA GGCTCAGGTC 1920
ACG 1923

(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1055 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:186:

CTGGCGTGCC AGTGTCACCG GCGATATGAC GTCGGCATTC AATTTCGCGG CCCCGCCGGA 60 

CCCGTCGCCA CCCAATCTGG ACCACCCGGT CCGTCAATTG CCGAAGGTCG CCAAGTGCGT 120

GCCCAATGTG GTGCTGGGTT TCTTGAACGA AGGCCTGCCG TATCGGGTGC CCTACCCCCA 180

AACAACGCCA GTCCAGGAAT CCGGTCCCGC GCGGCCGATT CCCAGCGGCA TCTGCTAGCC 240

GGGGATGGTT CAGACGTAAC GGTTGGCTAG GTCGAAACCC GCGCCAGGGC CGCTGGACGG 300

GCTCATGGCA GCGAAATTAG AAAACCCGGG ATATTGTCCG CGGATTGTCA TACGATGCTG 360

AGTGCTTGGT GGTTCGTGTT TAGCCATTGA GTGTGGATGT GTTGAGACCC TGGCCTGGAA 420

GGGGACAACG TGCTTTTGCC TCTTGGTCCG CCTTTGCCGC CCGACGCGGT GGTGGCGAAA 480

CGGGCTGAGT CGGGAATGCT CGGCGGGTTG TCGGTTCCGC TCAGCTGGGG AGTGGCTGTG 540

CCACCCGATG ATTATGACCA CTGGGCGCCT GCGCCGGAGG ACGGCGCCGA TGTCGATGTC 600

CAGGCGGCCG AAGGGGCGGA CGCAGAGGCC GCGGCCATGG ACGAGTGGGA TGAGTGGCAG 660

GCGTGGAACG AGTGGGTGGC GGAGAACGCT GAACCCCGCT TTGAGGTGCC ACGGAGTAGC 720

AGCAGCGTGA TTCCGCATTC TCCGGCGGCC GGCTAGGAGA GGGGGCGCAG ACTGTCGTTA 780

TTTGACCAGT GATCGGCGGT CTCGGTGTTC CCGCGGCCGG CTATGACAAC AGTCAATGTG 840

CATGACAAGT TACAGGTATT AGGTCCAGGT TCAACAAGGA (^ACAGGCAAC ATGGCAACAC 900 

GTTTTATGAC GGATCCGCAC GCGATGCGGG ACATGGCGGG CCGTTTTGAi^ GTGCACGCCC 960 

AGACGGTGGA GGACGAGGCT CGCCGGATGT GGGCGTCCGC GCAAAACATC TCGGGNGCGG 1020 
GCTGGAGTGG CATGGCCGAG GCGACCTCGC TAGAC 1055 
(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 359 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

CCGCCTCGTT GTTGGCATAC TCCGCCGCGG CCGCCTCGAC CGCACTGGCC GTGGCGTGTG 60

TCCGGGCTGA CCACCGGGAT CGCCGAACCA TCCGAGATCA CCTCGCAATG ATCCACCTCG 120

CGCAGCTGGT CACCCAGCCA CCGGGCGGTG TGCGACAGCG CCTGCATCAC CTTGGTATAG 180

CCGTCGCGCC CCAGCCGCAG GAAGTTGTAG TACTGGCCCA CCACCTGGTT ACCGGGACGG 240

GAGAAGTTCA GGGTGAAGGT CGGCATGTCG CCGCCGAGGT AGTTGACCCG GAAAACCAGA 300

TCCTCCGGCA GGTGCTCGGG CCCGCGCCAC ACGACAAACC CGACGCCGGG ATAGGTCAG 359

(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 350 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

AACGGGCCCG TGGGCACCGC TCCTCTAAGG GCTCTCGTTG GTCGCATGAA GTGCTGGAAG 60

GATGCATCTT GGCAGATTCC CGCCAGAGCA AAACAGCCGC TAGTCCTAGT CCGAGTCGCC 120

CGCAAAGTTC CTCGAATAAC TCCGTACCCG GAGCGCCAAA CCGGGTCTCC TTCGCTAAGC 180

TGCGCGAACC ACTTGAGGTT CCGGGACTCC TTGACGTCCA GACCGATTCG TTCGAGTGGC 240

TGATCGGTTC GCCGCGCTGG CGCGAATCCG CCGCCGAGCG GGGTGATGTC AACCCAGTGG 300 

GTGGCCTGGA AGAGGTGCTC TACGAGCTGT CTCCGATCGA GGACTTCTCC 350 
(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 679 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:

Glu Gln Pro Lys Gly Pro Phe Gly Glu Val Ile Glu Ala Phe Ala Asp
1 5 10 15

Gly Leu Ala Gly Lys Gly Lys Gln Ile Asn Thr Thr Leu Asn Ser Leu
20 25 30

Ser Gln Ala Leu Asn Ala Leu Asn Glu Gly Arg Gly Asp Phe Phe Ala
35 40 45

Val Val Arg Ser Leu Ala Leu Phe Val Asn Ala Leu His Gln Asp Asp
50 55 60

Gln Gln Phe Val Ala Leu Asn Lys Asn Leu Ala Glu Phe Thr Asp Arg
65 70 75 80

Leu Thr His Ser Asp Ala Asp Leu Ser Asn Ala Ile Gln Gln Phe Asp
85 90 95

Ser Leu Leu Ala Val Ala Arg Pro Phe Phe Ala Lys Asn Arg Glu Val
100 105 110

Leu Thr His Asp Val Asn Asn Leu Ala Thr Val Thr Thr Thr Leu Leu
115 120 125

Gln Pro Asp Pro Leu Asp Gly Leu Glu Thr Val Leu His Ile Phe Pro
130 135 140

Thr Leu Ala Ala Asn Ile Asn Gln Leu Tyr His Pro Thr His Gly Gly
145 150 155 160

Val Val Ser Leu Ser Ala Phe Thr Asn Phe Ala Asn Pro Met Glu Phe
165 170 175

Ile Cys Ser Ser Ile Gln Ala Gly Ser Arg Leu Gly Tyr Gln Glu Ser
180 185 190

Ala Glu Leu Cys Ala Gln Tyr Leu Ala Pro Val Leu Asp Ala Ile Lys
195 200 205

Phe Asn Tyr Phe Pro Phe Gly Leu Asn Val Ala Ser Thr Ala Ser Thr
210 215 220

Leu Pro Lys Glu Ile Ala Tyr Ser Glu Pro Arg Leu Gln Pro Pro Asn
225 230 235 240

Gly Tyr Lys Asp Thr Thr Val Pro Gly Ile Trp Val Pro Asp Thr Pro
245 250 255

Leu Ser His Arg Asn Thr Gln Pro Gly Trp Val Val Ala Pro Gly Met
260 265 270

Gln Gly Val Gln Val Gly Pro Ile Thr Gln Gly Leu Leu Thr Pro Glu
275 280 285

Ser Leu Ala Glu Leu Met Gly Gly Pro Asp Ile Ala Pro Pro Ser Ser
290 295 300

Gly Leu Gln Thr Pro Pro Gly Pro Pro Asn Ala Tyr Asp Glu Tyr Pro
305 310 315 320

Val Leu Pro Pro Ile Gly Leu Gln Ala Pro Gln Val Pro Ile Pro Pro
325 330 335

Pro Pro Pro Gly Pro Asp Val Ile Pro Gly Pro Val Pro Pro Val Leu
340 345 350

Ala Ala Ile Val Phe Pro Arg Asp Arg Pro Ala Ala Ser Glu Asn Phe
355 360 365

Asp Tyr Met Gly Leu Leu Leu Leu Ser Pro Gly Leu Ala Thr Phe Leu
370 375 380

Phe Gly Val Ser Ser Ser Pro Ala Arg Gly Thr Met Ala Asp Arg His
385 390 395 400

Val Leu Ile Pro Ala Ile Thr Gly Leu Ala Leu Ile Ala Ala Phe Val
405 410 415

Ala His Ser Trp Tyr Arg Thr Glu His Pro Leu Ile Asp Met Arg Leu
420 425 430

Phe Gln Asn Arg Ala Val Ala Gln Ala Asn Met Thr Met Thr Val Leu
435 440 445

Ser Leu Gly Leu Phe Gly Ser Phe Leu Leu Leu Pro Ser Tyr Leu Gln
450 455 460

Gln Val Leu His Gln Ser Pro Met Gln Ser Gly Val His Ile Ile Pro
465 470 475 480

Gln Gly Leu Gly Ala Met Leu Ala Met Pro Ile Ala Gly Ala Met Met
485 490 495

Asp Arg Arg Gly Pro Ala Lys Ile Val Leu Val Gly Ile Met Leu Ile
500 505 510

Ala Ala Gly Leu Gly Thr Phe Ala Phe Gly Val Ala Arg Gln Ala Asp
515 520 525

Tyr Leu Pro Ile Leu Pro Thr Gly Leu Ala Ile Met Gly Met Gly Met
530 535 540

Gly Cys Ser Met Met Pro Leu Ser Gly Ala Ala Val Gln Thr Leu Ala
545 550 555 560

Pro His Gln Ile Ala Arg Gly Ser Thr Leu Ile Ser Val Asn Gln Gln
565 570 575

Val Gly Gly Ser Ile Gly Thr Ala Leu Met Ser Val Leu Leu Thr Tyr
580 585 590

Gln Phe Asn His Ser Glu Ile Ile Ala Thr Ala Lys Lys Val Ala Leu
595 600 605

Thr Pro Glu Ser Gly Ala Gly Arg Gly Ala Ala Val Asp Pro Ser Ser
610 615 620

Leu Pro Arg Gln Thr Asn Phe Ala Ala Gln Leu Leu His Asp Leu Ser
625 630 635 640

His Ala Tyr Ala Val Val Phe Val Ile Ala Thr Ala Leu Val Val Ser
645 650 655

Thr Leu Ile Pro Ala Ala Phe Leu Pro Lys Gln Gln Ala Ser His Arg
660 665 670

Arg Ala Pro Leu Leu Ser Ala 
675 

(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 

Thr Pro Glu Lys Ser Phe Val Asp Asp Leu Asp Ile Asp Ser Leu Ser
1 5 10 15

Met Val Glu Ile Ala Val Gln Thr Glu Asp Lys Tyr Gly Val Lys Ile
20 25 30

Pro Asp Glu Asp Leu Ala Gly Leu Arg Thr Val Gly Asp Val Val Ala
35 40 45

Tyr Ile Gln Lys Leu Glu Glu Glu Asn Pro Glu Ala Ala Gln Ala Leu
50 55 60

Arg Ala Lys Ile Glu Ser Glu Asn Pro Asp Ala Ala Arg Ala Asp Arg
65 70 75 80

Cys Val Ser Pro Thr Ser Gln Ala Arg Asp Ala Arg Arg Pro Leu Ala
85 90 95

Arg Ser Ala Arg Leu Ala Cys Arg Arg Leu Pro Ala Ser Val Pro Thr
100 105 110

Thr Arg Arg Asp Pro Arg Glu Arg
115 120

(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 89 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:191:

Leu Ala Cys Gln Cys His Arg Arg Tyr Asp Val Gly Ile Gln Phe Arg
1 5 10 15

Gly Pro Ala Gly Pro Val Ala Thr Gln Ser Gly Pro Pro Gly Pro Ser
20 25 30

Ile Ala Glu Gly Arg Gln Val Arg Ala Gln Cys Gly Ala Gly Phe Leu
35 40 45

Glu Arg Arg Pro Ala Val Ser Gly Ala Leu Pro Pro Asn Asn Ala Ser
50 55 60

Pro Gly Ile Arg Ser Arg Ala Ala Asp Ser Gln Arg His Leu Leu Ala
65 70 75 80 

Gly Asp Gly Ser Asp Val Thr Val Gly 
85 

(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

Ala Ser Leu Leu Ala Tyr Ser Ala Ala Ala Ala Ser Thr Ala Leu Ala
1 5 10 15

Val Ala Cys Val Arg Ala Asp His Arg Asp Arg Arg Thr Ile Arg Asp
20 25 30

His Leu Ala Met Ile His Leu Ala Gln Leu Val Thr Gln Pro Pro Gly
35 40 45

Gly Val Arg Gln Arg Leu His His Leu Gly Ile Ala Val Ala Pro Gln
50 55 60

Pro Gln Glu Val Val Val Leu Ala His His Leu Val Thr Gly Thr Gly
65 70 75 80

Glu Val Gln Gly Glu Gly Arg His Val Ala Ala Glu Val Val Asp Pro
85 90 95

Glu Asn Gln Ile Leu Arg Gln Val Leu Gly Pro Ala Pro His Asp Lys
100 105 110

Pro Asp Ala Gly Ile Gly Gln
115 

(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 116 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

Arg Ala Arg Gly His Arg Ser Ser Lys Gly Ser Arg Trp Ser His Glu
1 5 10 15

Val Leu Glu Gly Cys Ile Leu Ala Asp Ser Arg Gln Ser Lys Thr Ala
20 25 30

Ala Ser Pro Ser Pro Ser Arg Pro Gln Ser Ser Ser Asn Asn Ser Val
35 40 45

Pro Gly Ala Pro Asn Arg Val Ser Phe Ala Lys Leu Arg Glu Pro Leu
50 55 60

Glu Val Pro Gly Leu Leu Asp Val Gln Thr Asp Ser Phe Glu Trp Leu
65 70 75 80

Ile Gly Ser Pro Arg Trp Arg Glu Ser Ala Ala Glu Arg Gly Asp Val
85 90 95

Asn Pro Val Gly Gly Leu Glu Glu Val Leu Tyr Glu Leu Ser Pro Ile
100 105 110

Glu Asp Phe Ser
115

(2) INFORMATION FOR SEQ ID NO: 194:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 811 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 

TGCTACGCAG CAATCGCTTT GGTGACAGAT GTGGATGCCG GCGTCGCTGC TGGCGATGGC 60 

GTGAAAGCCG CCGACGTGTT CGCCGCATTC GGGGAGAACA TCGAACTGCT CAAAAGGCTG 120 

GTGCGGGCCG CCATCGATCG GGTCGCCGAC GAGCGCACGT GCACGCACTG TCAACACCAC 180 

GCCGGTGTTC CGTTGCCGTT CGAGCTGCCA TGAGGGTGCT GCTGACCGGC GCGGCCGGCT 240

TCATCGGGTC GCGCGTGGAT GCGGCGTTAC GGGCTGCGGG TCACGACGTG GTGGGCGTCG 300

ACGCGCTGCT GCCCGCCGCG CACGGGCCAA ACCCGGTGCT GCCACCGGGC TGCCAGCGGG 360

TCGACGTGCG CGACGCCAGC GCGCTGGCCC CGTTGTTGGC CGGTGTCGAT CTGGTGTGTC 420

ACCAGGCCGC CATGGTGGGT GCCGGCGTCA ACGCCGCCGA CGCACCCGCC TATGGCGGCC 480

ACAACGATTT CGCCACCACG GTGCTGCTGG CGCAGATGTT CGCCGCCGGG GTCCGCCGTT 540

TGGTGCTGGC GTCGTCGATG GTGGTTTACG GGCAGGGGCG CTATGACTGT CCCCAGCATG 600

GACCGGTCGA CCCGCTGCCG CGGCGGCGA^:; CCGACCTGGA CAATGGGGTC TTCGAGCACC 660

GTTGCCCGnr; GTGCGGCGAG CCAGTCATCT GGCAATTGGT CGACGAAGAT GCCCCGTTGC 720

GCCCGCGCAG CCTGTACGCG GCAGCAAGAC CGCGCAGGAG CACTACGCGC TGGCGTGGTC 780 

GGAAACGAAT GGCGGTTCCG TGGTGGCGTT G 811 

(2) INFORMATION FOR SEQ ID NO: 195:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 966 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

GTCCCGCGAT GTGGCCGAGC ATGACTTTCG GCAACACCGG CGTAGTAGTC GAAGATATCG 60

GACTTTGTGG TCCCGGTGGC GGGATAGAGC ACCTGTCGGC GTTGGTCAGC GTCACCCGTT 120

GCTCGGACGC CGAACCCATG CTTTCAACGT AGCCTGTCGG TCACACAAGT CGCGAGCGTA 180

ACGTCACGGT CAAATATCGC GTGGAATTTC GCCGTGACGT TCCGCTCGCG GACAATCAAG 240

GCATACTCAC TTACATGCGA GCCATTTGGA CGGGTTCGAT CGCCTTCGGG CTGGTGAACG 300

TGCCGGTCAA GGTGTACAGC GCTACCGCAG ACCACGACAT CAGGTTCCAC CAGGTGCACG 360

CCAAGGACAA CGGACGCATC CGGTACAAGC GCGTCTGCGA GGCGTGTGGC GAGGTGGTCG 420

ACTACCGCGA TCTTGCCCGG GCCTACGAGT CCGGCGACGG CCAAATGGTG GCGATCACCG 480

ACGACGACAT CGCCAGCTTG CCTGAAGAAC GCAGCCGGGA GATCGAGGTG TTGGAGTTCG 540

TCCCCGCCGC CGACGTGGAC CCGATGATGT TCGACCGCAG CTACTTTTTG GAGCCTGATT 600

CGAAGTCGTC GAAATCGTAT GTGCTGCTGG CTAAGACACT CGCCGAGACC GACCGGATGG 660

CGATCGTGGA TCGCCCCACC GGCCGTGAAT GCAGGAAAAA TAAGAGCCGC TATCCACAAT 720

TCGGCGTCGA GCTCGGCTAC CACAAACGGT AGAACGATCG AGACATTCCC GAGCTGAAGT 780

GCGGCGCTAT AGAAGCCGCT CTGCGCGATT ATCAAACGCA AAATACGCTT ACTCATGCCA 840

TCGGCGCTGC TCACCCGATG CGACGTTTTT GCCACGCTCC ACCGCCTGCC GCGCGACCTC 900

AAGTGGGCAT GCATCCCACC CGTTCCCGGA AACCGGTTCC GGCGGGTCGG CTCATCGCTT 960

CATCCT 966

(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2367 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:196:

CCGCACCGCC CGC/\ATACCG CCAGCGCCAC CGTTACi^GCC GTTTGCGCCG TTGCCCCCGT 60

TGCCGCCCGT CCCCCCC^CCC ':CGCCGATGG AGTTCTCArC GCCAAAAGTA CTGi^CGTTGC 120

CACCGGAGCC GCCGTT'^CCG ':CGTCACCGC CAGCCCCGCC GACTCCACCi:^ GCCCCACCGA 180

CTCCGCCGCT G^CACCGTTG CCGCCGTTGC CGATCAACAT GCCGCTGGCG CCACCCTTGC 240

CACCCACGCC ACCGGCTCCG CCCACCCCGC CGACACCAAG CGAGCTGCCG CCGGAGCCAC 300 

CATCACCACC TACGCCACCG ACCGCCCAGA CACCAGCGAC CGGGTCTTCG TGAAACGTCG 360 

CGGTGCCACC ACCGCCGCCG TTACCGCCAA CCCCACCGGC AACGCCGGCG CCGCCATCCC 420

CGCCGGCCCC GGCGTTGCCG CCGTTGCCGC CGTTGCCGAA CAACAACCCG CCGGCGCCGC 480

CGTTGCCGCC CGCGCCGCCG GTCCCGCCGG CGCCGCCGAC GCCAAGGCCG CTGCCGCCCT 540

TGCCGCCATC ACCACCCTTG CCGCCGACCA CATCGGGTTC TGCCTCGGGG TCTGGGCTGT 600

CAAACCTi:GC GATGCCAGCG TTGCCGCCGC TTCCCCCGGG CCCCCCCGTG GCGCCGTCAC 660

CACCGATACC ACCCGCGCCA CCGGCGCCAC CGTTGCCGCC ATCACCGAAT AGCAACCCGC 720

CGGCGCCACC ATTGCCGCCA GCTCCCCCTG CGCCACCGTC GGCGCCGGAG GCGGCACTGG 780

CAGCCCCGTT ACCACCGAAA CCGCCGCTAC CACCGGTAGA GGTGGCAGTG GCGATGTGTA 840

CGAAAGCGCC GCCTCCGGCG CCGCCGCTAC CACCCCCACT GCCGGCGGCT ACACCGTCGG 900

ACCCGTTGCC ACCATCACCG CCAAAGGCGC TCGCAATGTC GCCCTGCGCG ACTCCGCCGT 960

CGCCGCCGTT GCCGCCGCCG CCACCGGCAG CGGCGGTACC GCCGTCACCA CCGGCACCGC 1020

CGGTGGCCTT GCCCGAGCCT GCCGTCGCGG TGGCACCGTC GCCGCCGGTG CCACCGGTCG 1080

GCGTGCCGGC AGTGCCATGG CCGCCCGTGC CGCCGTCGCC GCCGGTTTGA TCACCGATGC 1140

CGGACACATC TGCCGGGCTG TCCCCGGTGC TGGCCGCGGG GCCGGGCGTG GGATTGACCC 1200

CGTTTGCCCC GGCGAGGCCG GCGCCGCCGG TACCACCGGC GCCGCCATGG CCGAACAGCC 1260

CGGCGTTGCC GCCGTTACCG CCCGCACCCC CGATGCCTGC GGCCACGCTG GTGCCGCCGA 1320

CACCGCCGTT GCCGCCGTTG CCCCACAACC ACCCCCCGTT CCCACCGGCA CCGCCGGCCG 1380

CGGCGGTACC ACCGGCCCCG CCGTTGCCGC CGTTGCCGAT CAACCCGGCC GCGCCTCCGC 1440

TGCCGCCGGT TTGACCGAAC CCGCCAGCCG CGCCGTTGCC ACCGTTGCCA AACAGCAACC 1500

CGCCGGCCGC GCCAGGCTGC CCGGGTGCCG TCCCGTCGGC GC:GTTTCCG ATCAA:GGGC 1560

GCCCCAAAAG CGCCTCGGTG GGCGCATTCA CCGCACCCAG CAGACTCCGC TCAACAGCGG 1620

CTTCAGTGCT GGCATACCGA CCCGCGGCCG CAGTCAACGC CTGCACAAAC TGCTCGTGAA 1680

ACGCTGCCAC CTGTACGCTG AGCGCCT:;AT ACTGCCGAGC ATG'.^3CCCC^7 I\AC?jKCCCOo 1740

caatcgccgc C':.acacttca tcggcag:cg cagccaccac ttcc:^tcgT';: gggatcgccg 1800

CGGCCGCATT AGCCGCGCTC ACCTGCGAAC CAATAGTCGA TAAATCCAAA GCCGCAGTTG 1860

CCAGCAGCTG CGGCGTCGCG ATCACCAAGG ACACCTCGCA CCTCCGGATA CCCCATATCG 1920 

CCGCACCGTG TCCCCAGCGG CCACGTGACC TTTGGTCGCT GGCTGGCGGC CCTGACTATG 1980 

GCCGCGACGG CCCTCGTTCT GATTCGCCCC GGCGCGCAGC TTGTTGCGCG AGTTGAAGAC 2040 

GGGAGGACAG GCCGAGCTTG GTGTAGACGT GGGTCAAGTG GGAATGCACG GTCCGCGGCG 2100 

AGATGAATAG GCGGACGCCG ATCTCCTTGT TGCTGAGTCC CTCACCGACC AGTAGAGCCA 2160 

CCTCAAGCTC TGTCGGTGTC AACGCGCCCC AGCCACTTGT CGGGCGTTTC CGTGCACCGC 2220 

GGCCTCGTTG CGCGTACGCG ATCGCCTCAT CGATCGATAA CGCAGTTCCT TCGGCCCAGG 2280 

CATCGTCGAA CTCGCTGTCA CCCATGGATT TTCGAAGGGT GGCTAGCGAC GAGTTACAGC 2340

CCGCCTGGTA GATCCCGAAG CGGACCG 2367 
(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 376 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

Gln Pro Ala Gly Ala Thr Ile Ala Ala Ser Ser Pro Cys Ala Thr Val
1 5 10 15

Gly Ala Gly Gly Gly Thr Gly Ser Pro Val Thr Thr Glu Thr Ala Ala
20 25 30

Thr Thr Gly Arg Gly Gly Ser Gly Asp Val Tyr Glu Ser Ala Ala Ser
35 40 45

Gly Ala Ala Ala Thr Thr Pro Thr Ala Gly Gly Tyr Thr Val Gly Pro
50 55 60

Val Ala Thr Ile Thr Ala Lys Gly Ala Arg Asn Val Ala Leu Arg Asp
65 70 75 80

Ser Ala Val Ala Ala Val Ala Ala Ala Ala Thr Gly Ser Gly Gly Thr
85 90 95

Ala Val Thr Thr Gly Thr Ala Gly Gly Leu Ala Arg Ala Cys Arg Arg
100 105 110

Gly Gly Thr Val Ala Ala Gly Ala Thr Gly Arg Arg Ala Gly Ser Ala
115 120 125

Met Ala Ala Arg Ala Ala Val Ala Ala Gly Leu Ile Thr Asp Ala Gly
130 135 140

His Ile Cys Arg Ala Val Pro Gly Ala Gly Arg Gly Ala Gly Arg Gly
145 150 155 160

Ile Asp Pro Val Cys Pro Gly Glu Ala Gly Ala Ala Gly Thr Thr Gly
165 170 175

Ala Ala Met Ala Glu Gln Pro Gly Val Ala Ala Val Thr Ala Arg Thr
180 185 190

Pro Asp Ala Cys Gly His Ala Gly Ala Ala Asp Thr Ala Val Ala Ala
195 200 205

Val Ala Pro Gln Pro Pro Pro Val Pro Thr Gly Thr Ala Gly Arg Ala
210 215 220

Gly Thr Thr Gly Pro Ala Val Ala Ala Val Ala Asp Gln Pro Gly Arg
225 230 235 240

Ala Ser Ala Ala Ala Gly Leu Thr Glu Pro Ala Ser Arg Ala Val Ala
245 250 255

Thr Val Ala Lys Gln Gln Pro Ala Gly Arg Ala Arg Leu Pro Gly Cys
260 265 270

Arg Pro Val Gly Ala Val Ser Asp Gln Arg Ala Pro Gln Lys Arg Leu
275 280 285

Gly Gly Arg Ile His Arg Thr Gln Gln Thr Pro Leu Asn Ser Gly Phe
290 295 300

Ser Ala Gly Ile Pro Thr Arg Gly Arg Ser Gln Arg Leu His Lys Leu
305 310 315 320

Leu Val Lys Arg Cys His Leu Tyr Ala Glu Arg Leu Ile Leu Pro Ser
325 330 335

Met Gly Pro Glu Gln Pro Arg Asn Arg Arg Arg His Phe Ile Gly Ser
340 345 350

Arg Ser His His Phe Arg Arg Arg Asp Arg Arg Gly Arg Ile Ser Arg
355 360 365

Ala His Leu Arg Thr Asn Ser Arg 
370 375 

(2) INFORMATION FOR SEQ ID NO:198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2852 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:198:

GGCCAAAACG CCCCGGCGAT CGCGGCCACC GAGGCCGCCT ACGACCAGAT GTGGGCCCAG 60 

GACGTGGCGG CGATGTTTGG CTACCATGCC GGGGCTTCGG CGGCCGTCTC GGCGTTGACA 120 

CCGTTCGGCC AGGCGCTGCC GACCGTGGCG GGCGGCGGTG CGCTGGTCAG CGCGGCCGCG 180 

GCTCAGGTGA CCACGCGGGT CTTCCGCAAC CTGGGCTTGG CGAACGTCCG CGAGGGCAAC 240

GTCCGCAACG GTAATGTCCG GAACTTCAAT CTCGGCTCGG CCAACATCGG CAACGGCAAC 300

ATCGGCAGCG GCAACATCGG CAGCTCCAAC ATCGGGTTTG GCAACGTGGG TCCTGGGTTG 360

ACCGCAGCGC TGAACAACAT CGGTTTCGGC AACACCGGCA GCAACAACAT CGGGTTTGGC 420

AACACCGGCA GCAACAACAT CGGGTTCGGC AATACCGGAG ACGGCAACCG AGGTATCGGG 480

CTCACGGGTA GCGGTTTGTT GGGGTTCGGC GGCCTGAACT CGGGCACCGG CAACATCGGT 540

CTGTTCAACT CGGGCACCGG AAACGTCGGC ATCGGCAACT CGGGTACCGG GAACTGGGGC 600

ATTGGCAACT CGGGCAACAG CTACAACACC GGTTTTGGCA ACTCCGGCGA CGCCAACACG 660

GGCTTCTTCA ACTCCGGAAT AGCCAACACC GGCGTCGGCA ACGCCGGCAA CTACAACACC 720

GGTAGCTACA ACCCGGGCAA CAGCAATACC GGCGGCTTCA ACATGGGCCA GTACAACACG 780

GGCTACCTGA ACAGCGGCAA CTACAACACC GGCTTGGCAA ACTCCGGCAA TGTCAACACC 840

GGCGCCTTCA TTACTGGCAA CTTCAACAAC GGCTTCTTGT GGCGCGGCGA CCACCAAGGC 900

CTGATTTTCG GGAGCCCCGG CTTrTTC7U\C TCGACCAGTG CGCCi^TCGTC GGGATTCTTC 960

AACAGCGGTG CCGGTAGCGC GTCCGGCTTC CTGAACTCCG GTGCCAACAA TTCTGGCTTC 1020

TTCAA:TCTT CGTCGGGGGC CATCGGTAAC TCCGGCCTGG CAAACGCGGG CGTGCTGGTA 1080

TCGGGCGTGA TCAACTCGGG CMCACC'^TA TGG-^GTTTGT TCAACATGAG CCTGGTGGCC 1140

atcacaacgc cggccttgat :t':gggcttc ttcaacaccg gaagcaacat gtcgggattt 1200

TTCGGTGC^CC CACCGGTGTT GAATCT'::GGC :t::^CAAACC GGGTrCGTCGT GAA'-ATTCT'": 1260

ggcaaggcca acatcggoaa ttagmgatt :tg:;ggaggg gaaa:gtcgg tga:ttcaac 1320

ATCCTTGGCA GCGGChP.C'ZT CGGCAGCGAA AACATCTTGG GCAGCGGCAA CGTCGGCAGC 1380

TTCAATATCG GCAGTGGAAA CATCGGAGTA TTCAATGTCG GTTCCGGAAG CCTGGGAAAC 1440

TACAACATCG GATCCGGAAA CCTCGGGATC TACAACATCG GTTTTGGAAA CGTCGGCGAC 1500

TACAACGTCG GCTTCGGGAA CGCGGGCGAC TTCAACCAAG GCTTTGCCAA CACCGGCAAC 1560

TACAACATCG GGTTCGCCAA CACCGGCAAC AACAACATCG GCATCGGGCT GTCCGGCGAC 1620

AACCAGCAGG GCTTCAATAT TGCTAGCGGC TGGAACTCGG GCACCGGGAA CAGCGGCCTG 1680

TTCAATTCGG GCACCAATAA CGTTGGCATC TTCAACGCGG GCACCGGAAA CGTCGGCATC 1740

GCAAACTCGG GCACCGGGAA CTGGGGTATC GGGAACCCGG GTACCGACAA TACCGGCATC 1800

CTCAATGCTG GCAGCTACAA CACGGGCATC CTCAACGCCG GCGACTTCAA CACGGGCTTC 1860

TACAACACGG GCAGCTACAA CACCGGCGGC TTCAACGTCG GTAACACCAA CACCGGCAAC 1920

TTCAACGTGG GTGACACCAA TACCGGCAGC TATAACCCGG GTGACACCAA CACGGGCTTC 1980

TTCAATCCCG GCAACGTCAA TACCGGCGCT TTCGACACGG GCGACTTCAA CAATGGCTTC 2040

TTGGTGGCGG GCGATAACCA GGGCCAGATT GCCATCGATC TCTCGGTCAC CACTCCATTC 2100

ATCCCCATAA ACGAGCAGAT GGTCATTGAC GTACACAACG TAATGACCTT CGGCGGCAAC 2160

ATGATCACGG TCACCGAGGC CTCGACCGTT TTCCCCCAAA CCTTCTATCT GAGCGGTTTG 2220

TTCTTCTTCG GCCCGGTCAA TCTCAGCGCA TCCACGCTGA CCGTTCCGAC GATCACCCTC 2280

ACCATCGGCG GACCGACGGT GACCGTCCCC ATCAGCATTG TCGGTGCTCT GGAGAGCCGC 2340

ACGATTACCT TCCTCAAGAT CGATCCGGCG CCGGGCATCG GAAATTCGAC CACCAACCCC 2400

TCGTCCGGCT TCTTCAACTC GGGCACCGGT GGCACATCTG GCTTCCAAAA CGTCGGCGGC 2460

GGCAGTTCAG GCGTCTGGAA CAGTGGTTTG AGCAGCGCGA TAGGGAATTi; GGGTTTCCAG 2520

AACCTCGGCT CGCTGCAGT': AGGCTGGGCG Av^/.:CTGGGCA ACTCCGTATC GGGCTTTTTC 2580

AACACCAGTA CGGTGAACCT CTCCACGCCG GCCAATGTCT CGGGCCTGAA CAACATCGGC 2640

ACCAACCTGT CCGGCGTGTT CCGCGGTCCG ACCGGGACGA TTTTCAACGC GGGCCTTGCC 2700

AACCTGGGCC AGTTGAACAT SGGCAGCGC: TCGTGCCGAA TTCGGCACGA GTTAGATACG 2760

GTTTCAACAA TCATATCCGG i^TTTTGCGGC AGTGCATCAG ACGAATCGAA CCCGGGAAGC 2820
GTAAGCGAAT AAACCGA^T ^ GCCGCCTGTC AT 2852

(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: SM 3 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:199:

Gly Gln Asn Ala Pro Ala Ile Ala Ala Thr Glu Ala Ala Tyr Asp Gln
1 5 10 15

Met Trp Ala Gln Asp Val Ala Ala Met Phe Gly Tyr His Ala Gly Ala
20 25 30

Ser Ala Ala Val Ser Ala Leu Thr Pro Phe Gly Gln Ala Leu Pro Thr
35 40 45

Val Ala Gly Gly Gly Ala Leu Val Ser Ala Ala Ala Ala Gln Val Thr
50 55 60

Thr Arg Val Phe Arg Asn Leu Gly Leu Ala Asn Val Arg Glu Gly Asn
65 70 75 80

Val Arg Asn Gly Asn Val Arg Asn Phe Asn Leu Gly Ser Ala Asn Ile
85 90 95

Gly Asn Gly Asn Ile Gly Ser Gly Asn Ile Gly Ser Ser Asn Ile Gly
100 105 110

Phe Gly Asn Val Gly Pro Gly Leu Thr Ala Ala Leu Asn Asn Ile Gly
115 120 125

Phe Gly Asn Thr Gly Ser Asn Asn Ile Gly Phe Gly Asn Thr Gly Ser
130 135 140

Asn Asn Ile Gly Phe Gly Asn Thr Gly Asp Gly Asn Arg Gly Ile Gly
145 150 155 160

Leu Thr Gly Ser Gly Leu Leu Gly Phe Gly Gly Leu Asn Ser Gly Thr
165 170 175

Gly Asn Ile Gly Leu Phe Asn Ser Gly Thr Gly Asn Val Gly Ile Gly
180 185 190

Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn Ser Gly Asn Ser Tyr
195 200 205

Asn Thr Gly Phe Gly Asn Ser Gly Asp Ala Asn Thr Gly Phe Phe Asn
210 215 220

Ser Gly Ile Ala Asn Thr Gly Val Gly Asn Ala Gly Asn Tyr Asn Thr
225 230 235 240

Gly Ser Tyr Asn Pro Gly Asn Ser Asn Thr Gly Gly Phe Asn Met Gly
245 250 255

Gln Tyr Asn Thr Gly Tyr Leu Asn Ser Gly Asn Tyr Asn Thr Gly Leu
260 265 270

Ala Asn Ser Gly Asn Val Asn Thr Gly Ala Phe Ile Thr Gly Asn Phe
275 280 285

Asn Asn Gly Phe Leu Trp Arg Gly Asp His Gln Gly Leu Ile Phe Gly
290 295 300

Ser Pro Gly Phe Phe Asn Ser Thr Ser Ala Pro Ser Ser Gly Phe Phe
305 310 315 320

Asn Ser Gly Ala Gly Ser Ala Ser Gly Phe Leu Asn Ser Gly Ala Asn
325 330 335

Asn Ser Gly Phe Phe Asn Ser Ser Ser Gly Ala Ile Gly Asn Ser Gly
340 345 350

Leu Ala Asn Ala Gly Val Leu Val Ser Gly Val Ile Asn Ser Gly Asn
355 360 365

Thr Val Ser Gly Leu Phe Asn Met Ser Leu Val Ala Ile Thr Thr Pro
370 375 380

Ala Leu Ile Ser Gly Phe Phe Asn Thr Gly Ser Asn Met Ser Gly Phe
385 390 395 400

Phe Gly Gly Pro Pro Val Phe Asn Leu Gly Leu Ala Asn Arg Gly Val
405 410 415

Val Asn Ile Leu Gly Asn Ala Asn Ile Gly Asn Tyr Asn Ile Leu Gly
420 425 430

Ser Gly Asn Val Gly Asp Phe Asn Ile Leu Gly Ser Gly Asn Leu Gly
435 440 445

Ser Gln Asn Ile Leu Gly Ser Gly Asn Val Gly Ser Phe Asn Ile Gly
450 455 460

Ser Gly Asn Ile Gly Val Phe Asn Val Gly Ser Gly Ser Leu Gly Asn
465 470 475 480

Tyr Asn Ile Gly Ser Gly Asn Leu Gly Ile Tyr Asn Ile Gly Phe Gly
485 490 495

Asn Val Gly Asp Tyr Asn Val Gly Phe Gly Asn Ala Gly Asp Phe Asn
500 505 510

Gln Gly Phe Ala Asn Thr Gly Asn Asn Asn Ile Gly Phe Ala Asn Thr
515 520 525

Gly Asn Asn Asn He Gly He Gly Leu Ser Gly Asp Asn Gin Gin Gly 
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530 53rj 540 

Phe Asn Ile Ala Ser Gly Trp Asn Ser Gly Thr Gly Asn Ser Gly Leu
545 550 555 560

Phe Asn Ser Gly Thr Asn Asn Val Gly Ile Phe Asn Ala Gly Thr Gly
565 570 575

Asn Val Gly Ile Ala Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn
580 585 590

Pro Gly Thr Asp Asn Thr Gly Ile Leu Asn Ala Gly Ser Tyr Asn Thr
595 600 605

Gly Ile Leu Asn Ala Gly Asp Phe Asn Thr Gly Phe Tyr Asn Thr Gly
610 615 620

Ser Tyr Asn Thr Gly Gly Phe Asn Val Gly Asn Thr Asn Thr Gly Asn
625 630 635 640

Phe Asn Val Gly Asp Thr Asn Thr Gly Ser Tyr Asn Pro Gly Asp Thr
645 650 655

Asn Thr Gly Phe Phe Asn Pro Gly Asn Val Asn Thr Gly Ala Phe Asp
660 665 670

Thr Gly Asp Phe Asn Asn Gly Phe Leu Val Ala Gly Asp Asn Gln Gly
675 680 685

Gln Ile Ala Ile Asp Leu Ser Val Thr Thr Pro Phe Ile Pro Ile Asn
690 695 700

Glu Gln Met Val Ile Asp Val His Asn Val Met Thr Phe Gly Gly Asn
705 710 715 720

Met Ile Thr Val Thr Glu Ala Ser Thr Val Phe Pro Gln Thr Phe Tyr
725 730 735

Leu Ser Gly Leu Phe Phe Phe Gly Pro Val Asn Leu Ser Ala Ser Thr
740 745 750

Leu Thr Val Pro Thr Ile Thr Leu Thr Ile Gly Gly Pro Thr Val Thr
755 760 765

Val Pro Ile Ser Ile Val Gly Ala Leu Glu Ser Arg Thr Ile Thr Phe
770 775 780

Leu Lys Ile Asp Pro Ala Pro Gly Ile Gly Asn Ser Thr Thr Asn Pro
785 790 795 800

Ser Ser Gly Phe Phe Asn Ser Gly Thr Gly Gly Thr Ser Gly Phe Gln
805 810 815

Asn Val Gly Gly Gly Ser Ser Gly Val Trp Asn Ser Gly Leu Ser Ser
820 825 830



Ala Ile Gly Asn Ser Gly Phe Gln Asn Leu Gly Ser Leu Gln Ser Gly
835 840 845

Trp Ala Asn Leu Gly Asn Ser Val Ser Gly Phe Phe Asn Thr Ser Thr
850 855 860

Val Asn Leu Ser Thr Pro Ala Asn Val Ser Gly Leu Asn Asn Ile Gly
865 870 875 880

Thr Asn Leu Ser Gly Val Phe Arg Gly Pro Thr Gly Thr Ile Phe Asn
885 890 895

Ala Gly Leu Ala Asn Leu Gly Gln Leu Asn Ile Gly Ser Ala Ser Cys
900 905 910

Arg Ile Arg His Glu Leu Asp Thr Val Ser Thr Ile Ile Ser Ala Phe
915 920 925

Cys Gly Ser Ala Ser Asp Glu Ser Asn Pro Gly Ser Val Ser Glu
930 935 940

(2) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:200: 
GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC 53
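
As an illustrative cross-check that is not part of the original listing, the following Python sketch translates SEQ ID NO: 200 from its first ATG codon using the standard genetic code. The helper name `translate_from_first_atg` is hypothetical, and the reading frame is an assumption made for illustration only. Under that assumption the product begins with Met-Gly followed by six His residues, which agrees with the first fifteen residues printed for SEQ ID NO: 209.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# SEQ ID NO: 200 as printed in the listing, with the spacing removed.
SEQ_ID_200 = "GGATCCATATGGGCCATCATCATCATCATCACGTGATCGACATCATCGGGACC"


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG up to the first stop codon or the end.

    Returns one-letter amino acid codes. Choosing the first ATG as the
    start of the reading frame is an assumption made for this sketch.
    """
    start = dna.find("ATG")
    if start < 0:
        return ""
    residues = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for pos in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


print(translate_from_first_atg(SEQ_ID_200))  # prints MGHHHHHHVIDIIGT
```

The one-letter string corresponds to Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr.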

(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:
CCTGAATTCA GGCCTCCGTT GCGCCGGCCT CATCTTGAAC GA 
(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 

(2) INFORMATION FOR SEQ ID NO: 203:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:
CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 
(2) INFORMATION FOR SEQ ID NO:204: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:
GGATCCAGCG CTGACATGAA GACCGATGCC GCT 
(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:
GGATATCTGC AGAATTCAGG TTTAAAGCCC ATTTGCGA 38 
(2) INFORMATION FOR SEQ ID NO:206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 
CCGCATGCGA GCCACGTGCC CACAACGGCC 30 
(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:
CTTCATGGAA TTCTCAGGCC GGTAAGGTCC GCTGCGG 37
(2) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7676 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:
TGGCGAATGG GACGCGCCCT GTAGCGGCGC ATTAAGCGCG GCGGGTGTGG TGGTTACGCG 60

CAGCGTGACC GCTACACTTG CCAGCGCCCT AGCGCCCGCT CCTTTCGCTT TCTTCCCTTC 120

CTTTCTCGCC ACGTTCGCCG GCTTTCCCCG TCAAGCTCTA AATCGGGGGC TCCCTTTAGG 180

GTTCCGATTT AGTGCTTTAC GGCACCTCGA CCCCAAAAAA CTTGATTAGG GTGATGGTTC 240

ACGTAGTGGG CCATCGCCCT GATAGACGGT TTTTCGCCCT TTGACGTTGG AGTCCACGTT 300

CTTTAATAGT GGACTCTTGT TCCAAACTGG AACAACACTC AACCCTATCT CGGTCTATTC 360

TTTTGATTTA TAAGGGATTT TGCCGATTTC GGCCTATTGG TTAAAAAATG AGCTGATTTA 420

ACAAAAATTT AACGCGAATT TTAACAAAAT ATTAACGTTT ACAATTTCAG GTGGCACTTT 480

TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA 540

TCCGCTCATG AATTAATTCT TAGAAAAACT CATCGAGCAT CAAATGAAAC TGCAATTTAT 600 

TCATATCAGG ATTATCAATA CCATATTTTT GAAAAAGCCG TTTCTGTAAT GAAGGAGAAA 660 

ACTCACCGAG GCAGTTCCAT AGGATGGCAA GATCCTGGTA TCGGTCTGCG ATTCCGACTC 720 

GTCC.AACATC AATACAACCT ATTAATTTCC CCTCGTCAAA AATAAGGTTA TCAAGTGAGA '7 8 0 

AATCACCATG AGTGACGACT GAATCCGGTG AGAATGGCAA AAGTTTATGC ATTTCTTTCC 84 0 

AGACTTGTTC AACAGGCCAG CCATTACGCT CGTCATCAAA ATCACTCGCA TCAACCAAAC 900 

CGTTATTCAT TCGTGATTGC GCCTGAGCGA GACGAAATAC GCGATCGCTG TTAAAAGGAC 960 

AATTACAAAC AGGMTCGAA TGCAACCGGC GCAGGAACAC TGCCAGCGCA TCAACAATAT 102 0 

TTTCACCTGA ATCAGGATAT TCTTCTAATA CCTGGAATGC TGTTTTCCCG GGGATCGCAG 1080 

TGGTGAGTAA CCATGCATCA TCAGGAGTAC GGATAAAATG CTTGATGGTG GGAAGAGGCA 114 0 

T7\AATTCCGT CAGCCAGTTT AGTCTGACCA TCTCATCTGT AACATCATTG GCAACGCTAC 1200 

CTTTGCCATG TTTCAGAAAG AACTCTGGCG CATCGGGCTT CCCATACAAT CGATAGATTG 12 bD 

TCGCACCTGA TTGCCCGACA TTATCGCGAG CCCATTTATA GCCATATAA^A TCAGCATCCA 1320 

TGTTGGAATT TAATCGCGGC CTAGAGCAAG ACGTTTCCCG TTGAATATGG CTCATAACAC 138 0 

CCCTTGTATT ACTGTTTATG TA^V^CAGACA GTTTTATTGT TCATGACCAA AATCCCTTAA 14 40 

CGTGAGTTTT CGTTCCACTG AGi^GTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA 1500 

GATCCTTTTT TTCTGCGCGT AA.TCTGCTGC TTGCAAACAA AAAAACGACG i:^CTAGCAGCG 15 60 

GTGCTTTGTT TGCGGGATCA AGAG'^TACCA ACTCTTTTTC GGAAGGTAAC TGGCTTCAGC 162 0 

AGAG:GCAGA TACCAAATA: TGT:CTTCTA GTGTAGCCCT AGTTAGGCCA CCACTTCAAG 168 0 
AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC 1740

AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC GGATAAGGCG 18 00 

CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG AACGACCTAC 18 60 

ACCGAACTGA GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC CGAAGGGAGA 1920 

AAGGCGGACA GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT 198 0 

CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG 20 4 0 

CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC CAGCAACGCG 2100 

GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT TCCTGCGTTA 2160 

TCCCCTGATT CTGTGGATAA CCGTATTACC GCCTTTGAGT GAGCTGATAC CGCTCGCCGC 2220 

AGCCGAACGA CCGAGCGCAG CGAGTCAGTG AGCGAGGAAG CGGAAGAGCG CCTGATGCGG 2280 

TATTTTCTCC TTACGCATCT GTGCGGTATT TCACACCGCA TATATGGTGC ACTCTCAGTA 2 34 0 

CAATCTGCTC TGATGCCGCA TAGTTAAGCC AGTATACACT CCGCTATCGC TACGTGACTG 24 00 

GGTCATGGCT GCGCCCCGAC ACCCGCCAAC ACCCGCTGAC GCGCCCTGAC GGGCTTGTCT 24 60 

GCTCCCGGCA TCCGCTTACA GACAAGCTGT GACCGTCTCC GGGAGCTGCA TGTGTCAGAG 2520 

GTTTTCACCG TCATCACCGA AACGCGCGAG GCAGCTGCGG TAAAGCTCAT CAGCGTGGTC 2 58 0 

GTGAAGCGAT TCACAGATGT CTGCCTGTTC ATCCGCGTCC AGCTCGTTGA GTTTCTCCAG 2 64 0 

AAGCGTTAAT GTCTGGCTTC TGATAAAGCG GGCCATGTTA AGGGCGGTTT TTTCGTGTTT 2700 

GGTCACTGAT GCCTCCGTGT AAGGGGGATT TCTGTTCATG GGGGTAATGA TACCGATGAA 27 60 

ACGAGAGAGG ATGCTCACGA TACGGGTTAC TGATGATGAA CATGCCCGGT TACTGGAACG 28 20 

TTGTGAGGGT AAACAACTGG CGGTATGGAT GCGGCGGGAG CAGAGAAAAA TCACTCAGGG '.IH^O 

TCAATGCCAG CGCTTCGTTA ATACAGATGT AGGTGTTCCA CAGGGTAGCC AGCAGCATCC 2 54 0 

TGCGATGCAG ATCCGGAACA TAATGGTGCA GGGCGCTGAC TTCCGCGTTT CCAGACTTTA 3 0'.)0 

CGAAACACGG AAACCGAAGA CCATTCATGT TGTTGCTCAG GTCGCAGACG TTTTGCAGCA 30 bC) 

GCAGTCGCTT CACGTTCGCT CGCGTATCGG TGATTCATTC TGCTAACGAG TAAGGCAACC 31.:0 

CGGCCAGCCT AGCC-^v3<:;T:':C TCAACGACAG GAGCACGA'i'C AT!3CGCACCC GTGGGGCCGC 3 HO 

iI'ATGgcggl::^ at.^m:,gcct GCTTCTCGCC GAA.^CGTTTG i^TGGCGGGAC CAGTGACGA^ 3 2-Ui 

GGCTTGAGCG AGGG :GTGCA AGATTCCGAA TACCGCAAGC GACAGGCG-^A TCATCGTCGC 330C^ 

GCTCCACCGA AAGCGGTCCT CGCCGAAAAT GACCCAGAGG GCTGCCGGCA CCTGTCCTAC 3360

GAGTTGCATG ATAAAGAAGA CAGTCATAAG TGCGGCGACG ATAGTCATGC CCCGCGCCCA 3420

CCGGAAGGAG CTGACTGGGT TGAAGGCTCT CAAGGGCATC GGTCGAGATG CCGGTGCCTA 34 8 0 

ATGAGTGAGC TAACTTACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA 354 0 

CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG GTTTGCGTAT 3600 

TGGGCGCCAG GGTGGTTTTT CTTTTCACCA GTGAGACGGG CAACAGCTGA TTGCCCTTCA 3 660 

CCGCCTGGCC CTGAGAGAGT TGCAGCAAGC GGTCCACGCT GGTTTGCCCG AGCAGGCGAA 37 2 0 

AATCCTGTTT GATGGTGGTT AACGGCGGGA TATAACATGA GCTGTCTTCG GTATCGTCGT 37 BO 

ATCCCACTAC CGAGATATCC GCACCAACGC GCAGCCCGGA CTCGGTAATG GCGCGCATTG 3 8 4 0 

CGCCCAGCGC CATCTGATCG TTGGCAACCA GCATCGCAGT GGGTU^CGATG CCCTCATTCA 3 90 0 

GCATTTGCAT GGTTTGTTGA AAACCGGACA TGGCACTCCA GTCGCCTTCC CGTTCCGCTA 3 9 60 

TCGGCTGAAT TTGATTGCGA GTGAGATATT TATGCCAGCC AGCCAGACGC AGACGCGCCG 4 0;":0 

AGACAGAACT TAATGGGCCC GCTAACAGCG CGATTTGCTG GTGACCCAAT GCGACCAGAT 4 0B0 

GCTCCACGCC CAGTCGCGTA GCGTCTTCAT GGGAGAAAAT AATACTGTTG ATGGGTGTCT 414 0 

GGTCAGAGAC ATCAAGAAAT AACGCCGGAA CATTAGTGCA GGCAGCTTCC ACAGCAATGG 4 2 00 

CATCCTGGTC ATCCAGCGGA TAGTTAATGA TCAGCCCACT GACGCGTTGC GCGAGAAGAT 4 2 60 

TGTGCACCGC CGCTTTACAG GCTTCGACGC CGCTTCGTTG TACCATCGAC ACCACCACGC 4 320 

TGGCACCCAG TTGATCGGCG CGAGATTTAA TCGCCGCGAC AATTTGCGAC GGGGCGTGCA 4 3fi0 

GGGCCAGACT GGAGGTGGCA ACGCCAATCA GCAACGACTG TTTGCCCGCC AGTTGTTGTG 4 440 

CCACGCGGTT GGGAATGTAA TTCAGCTCCG CCATCGCCGC TTCCACTTTT TCCCGCGTTT 4 [-00 

TCGCAGAAAC GTGGCTGGCC TGGTTCACCA CGCGGGAAAC GGTCTGATAA GAGACACCGG 4 560 

GATACTCTGC GACATCGTAT AACGTTACTG GTTTCACATT CACCACCCTG AATTGACTCT 4 620 

CTTCCGGGCG CTATGATGCC ATACCGCGAA AGGTTTTGCG CCATTCGATG GTGTCCGGGA 4 6R0 

TCTCGACGCT GTCCCTTATG CGACTCCTGC ATTAGGAAGC AGCCCAGTAG TAGGTTGAGG 4 74 0 

CCGTTGAGCA CCGCCG:CGC AAGGAATGGT GCATGCAAGG AGATGGCGC': CAACAGTGCC 4 8 00 

CCGGCCAGGG CGCCTGl'CAC CATACCCACG CCGAAACAAG CGi:TCATi:iAG CCCGAAGTGG 4 8 60 

CGAGCCCGAT CTTCCC:A^C GGTGATGTCG GGGATATAGG CGCCAGCAAC CGCAGCTGTG 4 920 

GCGCCGGTGA TGCCGGCCAC GATGCGTCCG GCGTAGAGGA TCGAGATGTC GATCGGGGGA 4980

AATTAATACG ACTCACTATA GGGGAATTGT GAGCGGATAA CAATTCCCCT CTAGAAATAA 5040

TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGGCCAT CATCATCATC ATCACGTGAT 5100 

CGACATCATC GGGACCAGCC CCACATCCTG GGAACAGGCG GCGGCGGAGG CGGTCCAGCG 5160 

GGCGCGGGAT AGCGTCGATG ACATCCGCGT CGCTCGGGTC ATTGAGCAGG ACATGGCCGT 5220 

GGACAGCGCC GGCAAGATCA CCTACCGCAT CAAGCTCGAA GTGTCGTTCA AGATGAGGCC 52 8 0 

GGCGCAACCG AGGGGCTCGA AACCACCGAG CGGTTCGCCT GAAACGGGCG CCGGCGCCGG 534 0 

TACTGTCGCG ACTACCCCCG CCTCGTCGCC GGTGACGTTG GCGGAGACCG GTAGCACGCT 5 4 00 

GCTCTACCCG CTGTTCAACC TGTGGGGTCC GGCCTTTCAC GAGAGGTATC CG7y\CGTCAC 5 4 60 

GATCACCGCT CAGGGCACCG GTTCTGGTGC CGGGATCGCG CAGGCCGCCG CCGGGACGGT 5520 

CAACATTGGG GCCTCCGACG CCTATCTGTC GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT 558 0 

GATGAACATC GCGCTAGCCA TCTCCGCTCA GCAGGTCAAC TACAACCTGC CCGGAGTGAG 564 0 

CGAGCACCTC AAGCTGAACG GAAAAGTCCT GGCGGCCATG TACCAGGGCA CCATCAJUUVC 57 00 

CTGGGACGAC CCGCAGATCG CTGCGCTCAA CCCCGGCGTG AACGTGCCCG GCACCGCGGT 57 60 

AGTTCCGCTG CACCGCTCCG ACGGGTCCGG TGACACCTTC TTGTTCACGC AGTACCTGTC 5820 

CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC GCCCGGCTTC GGCACCACCG TCGACTTCCC 5880 

GGCGGTGCCG GGTGCGCTGG GTGAGAACGG CAACGGCGGC ATGGTGACCG GTTGCGCCGA 5 94 0 

GACACCGGGC TGCGTGGCCT ATATCGGCAT CAGCTTCCTC GACCAGGCCA GTCAACGGGG 6000 

ACTCGGCGAG GCCCAACTAG GCAATAGCTC TGGCAATTTC TTGTTGCCCG ACGCGCAAAG 60 60 

CATTCAGGCC GCGGCGGCTG GCTTGGCATC GAAAACCCCG GCGAACCAi^G CGATTTCGAT 6i::0 

GATCGACGGG GCCGCCCCGG ACGGGTACCC GATCATCAAC TACGAGTAGG CCATCGTCAA 6180 

CAACGGGCAA AJVGGACGCCG CCAGGGCGGA GACCTTGCAG GCATTTCTGC AGTGGGGGAT 62 4 0 

CACCGA'^-GGC AAGAAGGGCT CGTTCCTGGA CCAGGTTCAT TTCCAGCCGG TGCCGCCCGC 6300 

GGTGGTGAAG TTGTCTGACG GGTTGATCGC GACGATTTCC AGCGGTGAGA TG.AAGACCGA 63 60 

TGCCGCTACC GTGGCCGAGG AGGCAGGTAA TTTCGAGGGG ATCTCCGGCG ACCTGAAAAC 64 20 

CCAGATCGAC :AGGTGGAGT CGA'I^GGCAGG TTCGTTGCAG GGC':AGTG'.^C G0-;^GCGGGGG 6'U'O 

GGGGACGGCC :;:.:caggg:^' ; ■:^^,gt:;;gtggg cttcca^gm gcaggc^^ata agtagaagca 65 4 0 

GGAACTGGAG :^AGATCTGGA CGAA.TATTGG TGAGGCGGGG GTC'JAATACT GGAGGGGCGA 6600 

GGAGGAGCAG :AGCAGGGGG TGT':GTCGGA AATGGGCTTT GTGGCGAGAA GGGCCGCCTC 6 6 60 
GCCGCCGTCG ACCGCTGCAG CGCCACCCGC ACCGGCGACA CCTGTTGCCC CCCCACCACC 6720

GGCCGCCGCC AACACGCCGA ATGCCCAGCC GGGCGATCCC AACGCAGCAC CTCCGCCGGC 6780 

CGACCCGAAC GCACCGCCGC CACCTGTCAT TGCCCCAAAC GCACCCCAAC CTGTCCGGAT 684 0 

CGACAACCCG GTTGGAGGAT TCAGCTTCGC GCTGCCTGCT GGCTGGGTGG AGTCTGACGC 6900 

CGCCCACTTC GACTACGGTT CAGCACTCCT CAGCAAAACC ACCGGGGACC CGCCATTTCC 6960 

CGGACAGCCG CCGCCGGTGG CCAATGACAC CCGTATCGTG CTCGGCCGGC TAGACCAAAA 7 020 

GCTTTACGCC AGCGCCGAAG CCACCGACTC CAAGGCCGCG GCCCGGTTGG GCTCGGACAT 7 08 0 

GGGTGAGTTC TATATGCCCT ACCCGGGCAC CCGGATCAAC CAGGAAACCG TCTCGCTTGA 714 0 

CGCCAACGGG GTGTCTGGAA GCGCGTCGTA TTACGAAGTC AAGTTCAGCG ATCCGAGTAA 7 2 00 

GCCGAACGGC CAGATCTGGA CGGGCGTAAT CGGCTCGCCC GCGGCGAACG CACCGGACGC 7 2 60 

CGGGCCCCCT CAGCGCTGGT TTGTGGTATG GCTCGGGACC GCCAACAACC CGGTGGACAA 73 2 0 

GGGCGCGGCC AAGGCGCTGG CCGAATCGAT CCGGCCTTTG GTCGCCCCCC CGCCGGCGCC 7 3H0 

GGCACCGGCT CCTGCAGAGC CCGCTCCGGC GCCGGCGCCG GCCGGGGAAG TCGCTCCTAC 74 4 0 

CCCGACGACA CCGACACCGC AGCGGACCTT ACCGGCCTGA GAATTCTGCA GATATCCATC 7500 

ACACTGGCGG CCGCTCGAGC ACCACCACCA CCACCACTGA GATGCGGCTG CTAACAAAGC 7560

CCGAAAGGAA GCTGAGTTGG CTGCTGCCAC CGCTGAGCAA TAACTAGCAT AACCCCTTGG 7620

GGCCTCTAAA CGGGTCTTGA GGGGTTTTTT GCTGAAAGGA GGAACTATAT CCGGAT 7676

(2) INFORMATION FOR SEQ ID NO: 209:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 802 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr Ser
1 5 10 15

Pro Thr Ser Trp Glu Gln Ala Ala Ala Glu Ala Val Gln Arg Ala Arg
20 25 30

Asp Ser Val Asp Asp Ile Arg Val Ala Arg Val Ile Glu Gln Asp Met
35 40 45

Ala Val Asp Ser Ala Gly Lys Ile Thr Tyr Arg Ile Lys Leu Glu Val
50 55 60

Ser Phe Lys Met Arg Pro Ala Gln Pro Arg Gly Ser Lys Pro Pro Ser
65 70 75 80

Gly Ser Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro
85 90 95

Ala Ser Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr
100 105 110

Pro Leu Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn
115 120 125

Val Thr Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln
130 135 140

Ala Ala Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser
145 150 155 160

Glu Gly Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala
165 170 175

Ile Ser Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His
180 185 190

Leu Lys Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile
195 200 205

Lys Thr Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn
210 215 220

Leu Pro Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly
225 230 235 240

Asp Thr Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly
245 250 255

Trp Gly Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val
260 265 270

Pro Gly Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys
275 280 285

Ala Glu Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp
290 295 300

Gln Ala Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser
305 310 315 320

Gly Asn Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala
325 330 335

Gly Phe Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp
340 345 350

Gly Pro Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile
355 360 365

Val Asn Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala
370 375 380

Phe Leu His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp
385 390 395 400

Gln Val His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp
405 410 415

Ala Leu Ile Ala Thr Ile Ser Ser Ala Glu Met Lys Thr Asp Ala Ala
420 425 430

Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile Ser Gly Asp Leu
435 440 445

Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly
450 455 460

Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg
465 470 475 480

Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser
485 490 495

Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu Glu
500 505 510

Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe Val Pro Thr Thr Ala
515 520 525

Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro
530 535 540

Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro
545 550 555 560

Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro
565 570 575

Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn
580 585 590

Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser
595 600 605

Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr
610 615 620

Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr
625 630 635 640

Arg Ile Val Leu Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu
645 650 655

Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu
660 665 670

Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser
675 680 685

Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys
690 695 700

Phe Ser Asp Pro Ser Lys Pro Asn Gly Gln Ile Trp Thr Gly Val Ile
705 710 715 720

Gly Ser Pro Ala Ala Asn Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp
725 730 735

Phe Val Val Trp Leu Gly Thr Ala Asn Asn Pro Val Asp Lys Gly Ala
740 745 750

Ala Lys Ala Leu Ala Glu Ser Ile Arg Pro Leu Val Ala Pro Pro Pro
755 760 765 

Ala Pro Ala Pro Ala Pro Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala 
770 775 780 

Gly Glu Val Ala Pro Thr Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu
785 790 795 800

Pro Ala

CLAIMS

We claim: 

1. A polypeptide comprising an antigenic portion of a soluble
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
(SEQ ID NO: 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
Lys-Glu-Gly-Arg (SEQ ID NO: 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro
(SEQ ID NO: 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
NO: 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
NO: 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser (SEQ ID NO: 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
(SEQ ID NO: 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ
ID NO: 123); and

(j) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly
(SEQ ID NO: 131);
wherein Xaa may be any amino acid.

2. A polypeptide comprising an immunogenic portion of an
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe (SEQ ID NO: 124); and

(b) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val (SEQ ID NO: 132), wherein Xaa may be any
amino acid.

3. A polypeptide comprising an antigenic portion of a soluble 
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence 
encoded by a DNA sequence selected from the group consisting of the sequences recited in 
SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, the complements of said sequences, and DNA 
sequences that hybridize to a sequence recited in SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 
96 or a complement thereof under moderately stringent conditions. 

4. A polypeptide comprising an antigenic portion of an M. tuberculosis
antigen, or a variant of said antigen that differs only in conservative substitutions and/or 
modifications, wherein said antigen comprises an amino acid sequence encoded by a DNA 
sequence selected from the group consisting of the sequences recited in SEQ ID NOS: 26-51, 
133, 134, 158-178 and 196, the complements of said sequences, and DNA sequences that 
hybridize to a sequence recited in SEQ ID NOS: 26-51, 133, 134, 158-178 and 196 or a 
complement thereof under moderately stringent conditions. 

5. A DNA molecule comprising a nucleotide sequence encoding a 
polypeptide according to any one of claims 1-4.

6. A recombinant expression vector comprising a DNA molecule
according to claim 5. 

7. A host cell transformed with an expression vector according to claim 6. 

8. The host cell of claim 7 wherein the host cell is selected from the group 
consisting of E. coli, yeast and mammalian cells.

9. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting a biological sample with one or more polypeptides
according to any of claims 1-4; and

(b) detecting in the sample the presence of antibodies that bind to at least
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample.

10. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting a biological sample with a polypeptide having an N- 
terminal sequence selected from the group consisting of sequences provided in SEQ ID NO: 
129 and 130; and 

(b) detecting in the sample the presence of antibodies that bind to at least 
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 

11. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting a biological sample with one or more polypeptides encoded
by a DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136,
151-155, 184-188, 194-195 and 198, the complements of said sequences, and DNA sequences
that hybridize to a sequence recited in SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188,
194-195 and 198; and

(b) detecting in the sample the presence of antibodies that bind to at least
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample.

12. The method of any one of claims 9-11 wherein step (a) additionally
comprises contacting the biological sample with a 38 kD M. tuberculosis antigen and step (b)
additionally comprises detecting in the sample the presence of antibodies that bind to the
38 kD M. tuberculosis antigen.

13. The method of any one of claims 9-11 wherein the polypeptide(s) are
bound to a solid support. 

14. The method of claim 13 wherein the solid support comprises 
nitrocellulose, latex or a plastic material. 

15. The method of any one of claims 9-11 wherein the biological sample is
selected from the group consisting of whole blood, serum, plasma, saliva, cerebrospinal fluid 
and urine. 

16. The method of claim 15 wherein the biological sample is whole blood
or serum.

17. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with at least two oligonucleotide primers in a 
polymerase chain reaction, wherein at least one of the oligonucleotide primers is specific for a 
DNA molecule according to claim 5; and 

(b) detecting in the sample a DNA sequence that amplifies in the presence 
of the oligonucleotide primers, thereby detecting M. tuberculosis infection.

18. The method of claim 17, wherein at least one of the oligonucleotide
primers comprises at least about 10 contiguous nucleotides of a DNA molecule according to 
claim 5. 

19. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting the sample with at least two oligonucleotide primers in a 
polymerase chain reaction, wherein at least one of the oligonucleotide primers is specific for a 
DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151- 
155, 184-188, 194-195 and 198; and 

(b) detecting in the sample a DNA sequence that amplifies in the presence 
of the first and second oligonucleotide primers, thereby detecting M. tuberculosis infection. 

20. The method of claim 19, wherein at least one of the oligonucleotide 
primers comprises at least about 10 contiguous nucleotides of a DNA sequence selected from 
the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 
198. 
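
The "at least about 10 contiguous nucleotides" language of claims 18 and 20 can be illustrated with a short Python sketch. This is not part of the claims. It treats the threshold as a hard cutoff of 10, compares the same strand only (reverse complements are not considered), and uses hypothetical helper names. The primer is SEQ ID NO: 202 and the target is a 40-nucleotide stretch of SEQ ID NO: 208, both copied as printed in this listing.

```python
def longest_shared_run(primer: str, target: str) -> int:
    """Length of the longest stretch of contiguous nucleotides that occurs
    in both sequences (longest common substring, dynamic programming)."""
    best = 0
    prev = [0] * (len(target) + 1)
    for p in primer:
        curr = [0] * (len(target) + 1)
        for j, t in enumerate(target, start=1):
            if p == t:
                # Extend the run that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def shares_contiguous_run(primer: str, target: str, minimum: int = 10) -> bool:
    """True when primer and target share at least `minimum` contiguous bases.
    The default of 10 is an assumed reading of "at least about 10"."""
    return longest_shared_run(primer, target) >= minimum


# SEQ ID NO: 202 as printed, spacing removed.
PRIMER_202 = "GGATCCTGCAGGCTCGAAACCACCGAGCGGT"
# Part of SEQ ID NO: 208 (the 60-nucleotide row ending at position 5340), as printed.
TARGET_FRAGMENT = "GGCGCAACCGAGGGGCTCGAAACCACCGAGCGGTTCGCCT"

print(longest_shared_run(PRIMER_202, TARGET_FRAGMENT))
print(shares_contiguous_run(PRIMER_202, TARGET_FRAGMENT))
```

A practical primer check would also need to consider the complementary strand and hybridization conditions, which this sketch leaves out.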

21. The method of claims 17 or 19 wherein the biological sample is 
selected from the group consisting of whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. 

22. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with one or more oligonucleotide probes specific 
for a DNA molecule according to claim 5; and 

(b) detecting in the sample a DNA sequence that hybridizes to the 
oligonucleotide probe, thereby detecting M. tuberculosis infection.

23. The method of claim 22 wherein the probe comprises at least about 15
contiguous nucleotides of a DNA molecule according to claim 5. 

24. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the sample with one or more oligonucleotide probes specific 
for a DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136,
151-155, 184-188, 194-195 and 198; and 

(b) detecting in the sample a DNA sequence that hybridizes to the 
oligonucleotide probe, thereby detecting M. tuberculosis infection.

25. The method of claim 24 wherein the oligonucleotide probe comprises 
at least about 15 contiguous nucleotides of a DNA sequence selected from the group 
consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198. 

26. The method of claims 22 or 24 wherein the biological sample is 
selected from the group consisting of whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. 

27. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide according to any one of claims 1-4; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample.

28. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the biological sample with a binding agent which is capable
of binding to a polypeptide having an N-terminal sequence selected from the group consisting
of sequences provided in SEQ ID NO: 129 and 130; and

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample. 

29. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide encoded by a DNA sequence selected from the group consisting 
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198, the complements 
of said sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID 
NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample.

30. The method of any one of claims 27-29 wherein the binding agent is a 
monoclonal antibody. 

31. The method of any one of claims 27-29 wherein the binding agent is a
polyclonal antibody. 

32. A diagnostic kit comprising: 

(a) one or more polypeptides according to any of claims 1-4; and 

(b) a detection reagent.

33. A diagnostic kit comprising: 

(a) one or more polypeptides having an N-terminal sequence selected from 
the group consisting of sequences provided in SEQ ID NO: 129 and 130; and

(b) a detection reagent.

34. A diagnostic kit comprising:

(a) one or more polypeptides encoded by a DNA sequence selected from 
the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 
198, the complements of said sequences, and DNA sequences that hybridize to a sequence 
recited in SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198; and

(b) a detection reagent. 

35. The kit of any one of claims 32-34 wherein the polypeptide(s) are 
immobilized on a solid support. 

36. The kit of claim 35 wherein the solid support comprises nitrocellulose, 
latex or a plastic material. 

37. The kit of any one of claims 32-34 wherein the detection reagent 
comprises a reporter group conjugated to a binding agent. 

38. The kit of claim 37 wherein the binding agent is selected from the 
group consisting of anti-immunoglobulins, Protein G, Protein A and lectins. 

39. The kit of claim 37 wherein the reporter group is selected from the 
group consisting of radioisotopes, fluorescent groups, luminescent groups, enzymes, biotin 
and dye particles. 

40. A diagnostic kit comprising at least two oligonucleotide primers, at 
least one of the oligonucleotide primers being specific for a DNA molecule according to 
claim 5.

41. A diagnostic kit according to claim 40, wherein at least one of the
oligonucleotide primers comprises at least about 10 contiguous nucleotides of a DNA
molecule according to claim 5. 

42. A diagnostic kit comprising at least two oligonucleotide primers, at
least one of the primers being specific for a DNA sequence selected from the group consisting
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198.

43. A diagnostic kit according to claim 42, wherein at least one of the 
oligonucleotide primers comprises at least about 10 contiguous nucleotides of a DNA
sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155,
184-188, 194-195 and 198.

44. A diagnostic kit comprising at least one oligonucleotide probe, the 
oligonucleotide probe being specific for a DNA molecule according to claim 5. 

45. A kit according to claim 44, wherein the oligonucleotide probe 
comprises at least about 15 contiguous nucleotides of a DNA molecule according to claim 5. 

46. A diagnostic kit comprising at least one oligonucleotide probe, the 
oligonucleotide probe being specific for a DNA sequence selected from the group consisting 
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198. 

47. A kit according to claim 46, wherein the oligonucleotide probe 
comprises at least about 15 contiguous nucleotides of a DNA sequence selected from the 
group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198.

48. A monoclonal antibody that binds to a polypeptide according to any of
claims 1-4.

49. A polyclonal antibody that binds to a polypeptide according to any of
claims 1-4.

50. A fusion protein comprising two or more polypeptides according to 
any one of claims 1-4. 

51. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and ESAT-6 (SEQ ID NO: 99). 

52. A fusion protein comprising a polypeptide having an N-terminal 
sequence selected from the group of sequences provided in SEQ ID NOS: 129 and 130. 

53. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and the M. tuberculosis antigen 38 kD (SEQ ID NO: 150).

54. A diagnostic kit comprising: 

(a) one or more fusion proteins according to any one of claims 50-53; and 

(b) a detection reagent.

Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. □ Claims Nos.:
because they relate to subject matter not required to be searched by this Authority, namely:

2. □ Claims Nos.:
because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:

3. □ Claims Nos.:
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

see continuation sheet

1. □ As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.

2. □ As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. □ As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:

4. X No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 48-51, 53, 54 all partially
(subject 1. on next sheet)

Remark on Protest

□ The additional search fees were accompanied by the applicant's protest.
□ No protest accompanied the payment of additional search fees.

FURTHER INFORMATION CONTINUED FROM PCT/ISA/210



1. Claims: 1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44,
45, 48-51, 53, 54 all partially. 



A polypeptide comprising an antigenic portion of a soluble 
M. tuberculosis antigen or a variant, having an N-terminal 
amino acid sequence as in Seq.ID:115 and/or encoded by a DNA
molecule as in Seq.ID:96, complements of said sequence or
sequences hybridizing to it. A DNA molecule comprising a 
sequence encoding said polypeptide. An expression vector 
comprising said DNA molecule, a host cell transformed with 
said expression vector. A method for detecting M. 
tuberculosis infection in a biological sample by detection 
of antibodies binding to said polypeptide or by detection of 
said polypeptide. A method for detecting M. tuberculosis
infection in a biological sample by detection of said DNA
sequence. Diagnostic kits thereof. An antibody binding to
said polypeptide. A fusion protein comprising said 
polypeptide. Diagnostic kit comprising said fusion protein. 



2. Claims: 1, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:116. 



3. Claims: 1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 
45, 48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:117 and 25.



4. Claims: 1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 
45, 48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:118 and 24. 



5. Claims: 1, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:119. 



6. Claims: 1, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:120.

7. Claims: 1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44,

45, 48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:121 and 52.

8. Claims: 1, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:122. 

9. Claims: 1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 

45, 48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:123 and 94. 

10. Claims: 1, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:131. 

11. Claims: 2, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:124. 

12. Claims: 2, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:132. 

13. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:1.

14. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:2.

15. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:4 and 17. 

16. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:5. 

17. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:6. 

18. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:7. 

19. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:8. 

20. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:9. 

21. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:10 and 13.

22. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:14.

23. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:15. 

24. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:16.

25. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:18.

26. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:19. 

27. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:20. 

28. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:21. 

29. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:22. 

30. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:23.

31. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:26. 

32. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:27. 

33. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:28. 

34. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:29. 

35. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:30.

36. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:31.

37. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:32. 

38. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:33.

39. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:34.

40. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:35.

41. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:36.

42. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:37.

43. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:38.

44. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:39.

45. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:40.

46. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:41.

47. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:42. 

48. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:43, 44 and 178. 

49. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:45. 

50. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:46.

51. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:47.

52. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:48.

53. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:49. 

54. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:50.

55. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:51.

56. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:133.

57. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:134.

58. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:158.

59. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:159.

60. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:160.

61. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:161.

62. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:162.

63. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:163.

64. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:164 and 165.

65. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:166 and 167.

66. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:168 and 169.

67. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:170 and 171.

68. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:172 and 173.

69. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:174 and 175.

70. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:176 and 177.

71. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,
48-51, 53, 54 all partially.



Same as invention 1 but for Seq.ID:196. 



72. Claims: 10, 12-16, 28, 30, 31, 33, 35-39, 52,
54 all partially. 



A method for detecting M. tuberculosis infection in a 
biological sample by detection of antibodies binding to a 
polypeptide having an N-terminal sequence as in Seq.ID:129, 
or by detection of a protein or polypeptide that binds to an 
agent binding to a polypeptide having an N-terminal sequence 
as in Seq.ID:129. Diagnostic kits thereof. A fusion protein 
comprising said polypeptide. Diagnostic kit comprising said 
fusion protein. 



73. Claims: 10, 12-16, 28, 30, 31, 33, 35-39, 52, 
54 all partially. 



Same as invention 72 but for Seq.ID:130. 



74. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,
47 all partially. 



A method for detecting M. tuberculosis infection in a 
biological sample by detection of antibodies binding to a 
polypeptide encoded by a DNA sequence consisting of 
Seq.ID:3, complements or hybridizing sequences. A method for 
detecting M. tuberculosis infection in a biological sample 
by detection of said DNA sequence. A method for detecting M.
tuberculosis infection in a biological sample by detection 
of a protein or polypeptide that binds to an agent binding 
to a polypeptide encoded by Seq.ID:3, complements or 
hybridizing sequences. Diagnostic kits thereof. 



75. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,
47 all partially.



Same as invention 74 but for Seq.ID:11.



76. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,
47 all partially.



Same as invention 74 but for Seq.ID:12.

77. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,

47 all partially. 

Same as invention 74 but for Seq.ID:135. 

78. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:136. 

79. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,

47 all partially. 

Same as invention 74 but for Seq.ID:151. 

80. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:152. 

81. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,

47 all partially. 

Same as invention 74 but for Seq.ID:153. 

82. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,

47 all partially. 

Same as invention 74 but for Seq.ID:154 and 155. 

83. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:184. 

84. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially.

Same as invention 74 but for Seq.ID:185.

85. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,

47 all partially.

Same as invention 74 but for Seq.ID:186.

86. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:187. 

87. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:188.

88. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,

47 all partially. 

Same as invention 74 but for Seq.ID:194 and 195. 

89. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:198.